Why do duplicate rows appear in my Destination for Pipelines created using XMIN mode?
On This Page
This happens due to XMIN wraparound in your PostgreSQL database, which can cause previously ingested rows to be incorrectly identified as new, resulting in duplicate records in your Destination.
Understanding XMIN and Wraparound
In PostgreSQL, each row in a table is assigned an XMIN value, which Hevo uses to identify new or updated rows during incremental syncs. XMIN has a maximum limit of 4,294,967,295, and once this limit is reached, it resets. This behavior is known as XMIN wraparound.
To handle this behavior, PostgreSQL uses the VACUUM FREEZE process. As part of this:
-
Old, unchanged rows are updated by replacing their XMIN values with a constant frozen value.
-
PostgreSQL then considers these rows permanently unchanged and stops tracking them for updates.
-
If all the rows on a page are updated in this way, the entire page is also considered frozen.
However, after the wraparound reset, Hevo may incorrectly detect these frozen rows as recently changed and re-ingest them, creating duplicate records in the Destination.
Suggested Actions to Prevent Duplicate Records
-
Recommended: Create a Pipeline using logical replication mode, as it is not affected by XMIN wraparound.
-
Alternative: Create Extensions and Wrapper Functions in your PostgreSQL database as described below to help Hevo identify and skip frozen rows during incremental ingestion.
Create Extensions and Wrapper Functions
To identify and skip frozen pages and rows, Hevo uses the pg_visibility and pageinspect extensions along with wrapper functions. These extensions provide visibility into which pages and rows are frozen. The wrapper functions provide this information to Hevo, allowing it to skip them during incremental ingestion.
Note:
-
This action is not recommended for PostgreSQL database version 16 or later, as the required extensions may not work as expected.
-
XMIN wraparound handling is not supported for partitioned tables.
-
The
pageinspectextension is not supported for Amazon Aurora PostgreSQL. As a result, Hevo skips only frozen pages but may still re-ingest frozen rows on other pages, leading to duplicate records.
Perform the following steps to create the required extensions and wrapper functions:
Note: You must have the Superuser privilege for your PostgreSQL database to perform these steps.
1. Create the pg_visibility extension
This extension provides information about which pages are frozen.
-
Connect to your PostgreSQL database instance using an SQL client tool, such as psql.
-
Run the following command to create the
pg_visibilityextension:CREATE EXTENSION IF NOT EXISTS pg_visibility; -
Run the following command to verify that the extension is created successfully:
SELECT * FROM pg_available_extensions WHERE name = 'pg_visibility' AND installed_version IS NOT NULL;
2. Create the get_all_pages wrapper function
This function enables Hevo to identify and skip frozen pages using the pg_visibility extension.
-
Run the following command to create the function and grant permissions:
CREATE SCHEMA IF NOT EXISTS hevo; CREATE OR REPLACE FUNCTION hevo.get_all_pages(v_table_name varchar) RETURNS TABLE ( pagenumber integer, all_visible_yn boolean, all_frozen_yn boolean ) LANGUAGE plpgsql SECURITY DEFINER AS $$ BEGIN RETURN QUERY SELECT blkno::int AS pagenumber, all_visible AS all_visible_yn, all_frozen AS all_frozen_yn FROM pg_visibility_map(v_table_name::regclass); END; $$; GRANT USAGE ON SCHEMA hevo TO <database_username>; GRANT EXECUTE ON FUNCTION hevo.get_all_pages TO <database_username>;Note: Replace <database_username> with the name of your database user configured in your Pipeline.
-
Run the following command to verify that the wrapper function is created successfully:
SELECT p.proname as name FROM pg_catalog.pg_namespace n JOIN pg_catalog.pg_proc p ON p.pronamespace = n.oid WHERE n.nspname = 'hevo' AND proname IN ('get_all_pages') AND has_function_privilege(p.oid, 'execute') AND has_schema_privilege(n.oid, 'usage');
3. Create the pageinspect extension
This extension provides information about which rows within a page are frozen.
Note: This extension is not supported for Amazon Aurora PostgreSQL. As a result, Hevo skips only frozen pages but may still re-ingest frozen rows on other pages, leading to duplicate records.
-
Run the following command to create the
pageinspectextension:CREATE EXTENSION IF NOT EXISTS pageinspect; -
Run the following command to verify that the extension is created successfully:
SELECT * FROM pg_available_extensions WHERE name = 'pageinspect' AND installed_version IS NOT NULL;
4. Create the get_items_in_page wrapper function
This function enables Hevo to identify and skip frozen rows within a page using the pageinspect extension.
-
Run the following command to create the function and grant permissions:
CREATE OR REPLACE FUNCTION hevo.get_items_in_page( v_table_name varchar, v_page_number integer ) RETURNS TABLE ( lp smallint, lp_off smallint, lp_flags smallint, lp_len smallint, t_xmin xid, t_xmax xid, t_field3 integer, t_ctid tid, t_infomask integer, t_hoff smallint, t_bits text, t_oid oid, t_data bytea ) LANGUAGE plpgsql SECURITY DEFINER AS $$ BEGIN RETURN QUERY SELECT h.lp, h.lp_off, h.lp_flags, h.lp_len, h.t_xmin, h.t_xmax, h.t_field3, h.t_ctid, h.t_infomask, h.t_hoff, h.t_bits, h.t_oid, h.t_data FROM heap_page_items(get_raw_page(v_table_name, v_page_number)) AS h; END; $$; GRANT EXECUTE ON FUNCTION hevo.get_items_in_page TO <database_username>;Note: Replace <database_username> with the name of your database user configured in your Pipeline.
-
Run the following command to verify that the wrapper function is created successfully:
SELECT p.proname as name FROM pg_catalog.pg_namespace n JOIN pg_catalog.pg_proc p ON p.pronamespace = n.oid WHERE n.nspname = 'hevo' AND proname IN ('get_items_in_page') AND has_function_privilege(p.oid, 'execute') AND has_schema_privilege(n.oid, 'usage');
Once the extensions and wrapper functions are in place, Hevo can detect and skip frozen pages and rows during incremental runs, preventing duplicate records in your Destination caused by XMIN wraparound.
Revision History
Refer to the following table for the list of key updates made to this page:
| Date | Release | Description of Change |
|---|---|---|
| May-07-2026 | NA | New Document. |