Partitioning in BigQuery

Google BigQuery offers a serverless way to handle massive datasets through the use of partitioned tables. These are tables that are divided into segments to make it easier to manage and query data. Partitions can improve query performance, and control costs by reducing the number of bytes read by a query.

There are two types of table partitioning in BigQuery:

  • Ingestion time based: Tables are partitioned based on the data’s ingestion (load) date or arrival date.

  • Field based: Tables that are partitioned based on the timestamp/date column.

Ingestion-time Partitioned Tables

When you create a table partitioned by ingestion time, it automatically loads data into date-based partitions that denotes the date when the data arrived.

The tables have a pseudo column named _PARTITIONTIME that contains a date-based timestamp for data that is loaded into the table. By using _PARTITIONTIME in your query, you can restrict the number of partitions scanned. However, some uses of pseudo-columns do not limit the number of partitions scanned, you can find more details on that here.


Field-based Partitioned Tables

These tables allow you to provide any TIMESTAMP or DATE columns of your choice. The data is automatically sent to the appropriate partition based on the date value(in UTC). We don’t need a _PARTITIONTIME pseudo column instead the partitioning column can be used to restrict the amount of data scanned.

When you create field-based partitioned tables, two special partitions are created:

  • The __NULL__ partition representing rows with NULL values in the partitioning column

  • The __UNPARTITIONED__ partition representing data that exists outside the allowed range of dates

For information on partitioned tables and its quotas/limit please go through their official documentation.


Destination Considerations


Using BigQuery Partitioning in Hevo

You can enable partitioning while creating the tables in Schema Mapper or Models. By default, No Partition is selected. To partition a table:

  1. Select the Partition Style.

  2. Select the Partition Type. Default value: Day. From Release 1.67 onwards, you can specify a different partitioning type, such as Hour, Month, and Year. Do remember that you can only partition the table on a field that which is of date or timestamp data type.


See Also


Revision History

Refer to the following table for the list of key updates made to this page:

Date Release Description of Change
Aug-09-2021 1.67 Updated the section, Using BigQuery Partitioning in Hevo.
May-19-2021 NA Added section, Destination Considerations.
Last updated on 11 Oct 2021