Google Drive

You can load data from files in your Google Drive into a Destination database or data warehouse using Hevo Pipelines.

For Pipelines created using this Source, Hevo provides a fully managed BigQuery data warehouse as a possible Destination. This option remains available until you set up your first BigQuery Destination, irrespective of any other Destinations you may have. With the managed warehouse, you are charged only the cost that Hevo incurs for your project in Google BigQuery. The invoice is generated at the end of each month, and payment is collected through the payment instrument you have set up. You can create your Pipeline and directly start analyzing your Source data. Read Hevo Managed Google BigQuery.

As of Release 1.66, __hevo_source_modified_at is uploaded to the Destination as a metadata field. For existing Pipelines that have this field:

  • If this field is displayed in the Schema Mapper, ignore it and do not map it to a Destination table column; otherwise, the Pipeline displays an error.

  • Hevo automatically loads this information in the __hevo_source_modified_at column, which is already present in the Destination table.

You can, however, continue to use __hevo_source_modified_at to create transformations using the function event.getSourceModifiedAt(). Read Metadata Column __hevo_source_modified_at.
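As a hedged illustration, the snippet below sketches how event.getSourceModifiedAt() might be used inside a Python Transformation. The Event stand-in class is purely hypothetical, included only so the sketch is self-contained; it mimics just the two methods named in this document and is not Hevo's actual event object.

```python
# Hypothetical stand-in for Hevo's transformation event object; only the
# getProperties/getSourceModifiedAt methods referenced in this document.
class Event:
    def __init__(self, properties, source_modified_at):
        self._properties = properties
        self._source_modified_at = source_modified_at

    def getProperties(self):
        return self._properties

    def getSourceModifiedAt(self):
        return self._source_modified_at


def transform(event):
    # Copy the Source-modification timestamp into a regular field so it
    # remains queryable downstream, e.g. for data-freshness checks.
    props = event.getProperties()
    props["source_modified_at"] = event.getSourceModifiedAt()
    return event
```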

Existing Pipelines that do not have this field are not impacted.


User Authentication

You can connect to the Google Drive Source via User accounts (OAuth) or Service accounts. One service account can be mapped to your entire team. Read Google Account Authentication Methods to know how to set up a service account if you do not already have one.


Prerequisites

  • An active Google Drive account from which data is to be ingested.

  • The Google Drive API is enabled for the service account, if you are connecting through one.


Configuring Google Drive as a Source

Perform the following steps to configure Google Drive as a Source in your Pipeline:

  1. Click PIPELINES in the Asset Palette.

  2. Click + CREATE in the Pipelines List View.

  3. In the Select Source Type page, select Google Drive.

  4. In the Configure your Drive account page, select the authentication method for connecting to Google Drive.

    Select authentication account

  5. Do one of the following:

    • To connect using a User Account:

      1. Click + ADD DRIVE ACCOUNT.

      2. Select the Google account associated with your Google Drive data and click ALLOW to authorize Hevo to access the data.

    • To connect using a Service Account:

      1. Attach the Service Account Key file.

        Note: Hevo supports only JSON format for the key file.

      2. Click CONFIGURE DRIVE ACCOUNT.

  6. Select the folders which contain the files whose data is to be replicated. Hevo ingests data from all supported files present in the chosen folders.

  7. In the Configure your Drive Source page, specify the following:

    Google Drive Settings

    • Pipeline Name: A unique name for your Pipeline, not exceeding 255 characters.

    • Folders: Select the check box next to each folder whose files you want to replicate to the Destination system. Hevo ingests data from all supported files present in the selected folders.

      Note: If a folder contains a child folder, you must explicitly select the child folder to replicate its files to the Destination system.

  8. Click CONTINUE.

  9. Proceed to configuring the data ingestion and setting up the Destination.
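The Service Account Key file attached in step 5 is a standard Google Cloud service account key in JSON format. A representative, redacted shape is shown below; every value is a placeholder, and the project and account names are purely illustrative:

```json
{
  "type": "service_account",
  "project_id": "my-project",
  "private_key_id": "0123abcd...",
  "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
  "client_email": "hevo-drive@my-project.iam.gserviceaccount.com",
  "client_id": "123456789012345678901",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token"
}
```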


Treatment of Duplicate Column Headers

To uniquely identify each column in the Source during ingestion, Hevo renames any duplicate column headers it finds, as follows:

  • Columns with the same header: The column header is changed to lowercase, and the column letter is suffixed to it. For example, if there are three columns named test in columns G, H, and K, these are changed to test_g, test_h, and test_k. The same applies to headers containing spaces. For example, if there are three columns named Software Used in columns H, I, and J, these are changed to software_used_h, software_used_i, and software_used_j.

  • Columns with the same header but different case: All such column headers are changed to lowercase and assigned a sequential numeric suffix. For example, if there are three columns, test, Test, and tEst, these are changed to test, test_1, and test_2 during the auto-mapping phase. The same applies to headers containing spaces. For example, if there are two columns, Software Used and Software used, these are changed to software_used and software_used_1.

  • Columns with __Hevo_id as the header: The column header is changed to __hevo_id, and the __hevo_id column that Hevo adds to keep track of offsets during ingestion is named as __hevo_id_1.

  • Columns with __hevo_id as the header: Hevo suffixes the column letter to the column header. For example, if the __hevo_id column lies in column B, it is renamed to __hevo_id_b. If a __hevo_id_b column is also present in the Source column list, the original __hevo_id column is renamed to __hevo_id_b_b. This is done because Hevo adds the column __hevo_id as a primary key to keep track of offsets during ingestion; any existing column with that name is therefore renamed.
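The first two renaming rules above can be sketched in Python. This is an illustrative approximation, not Hevo's actual implementation: it assumes the headers occupy contiguous spreadsheet columns starting at A (the examples above use non-contiguous columns such as G, H, and K), and for simplicity it normalizes every header to lowercase with underscores.

```python
from string import ascii_lowercase

def dedupe_headers(headers):
    """Approximate the duplicate-header renaming rules described above.

    Assumes headers sit in contiguous columns A, B, C, ... (illustrative
    simplification; supports up to 26 columns).
    """
    lowered = [h.strip().lower().replace(" ", "_") for h in headers]
    out = []
    for i, low in enumerate(lowered):
        dup_idx = [j for j, l in enumerate(lowered) if l == low]
        if len(dup_idx) == 1:
            out.append(low)  # unique header: normalized form only
        elif all(headers[j] == headers[dup_idx[0]] for j in dup_idx):
            # Identical headers: suffix the spreadsheet column letter.
            out.append(f"{low}_{ascii_lowercase[i]}")
        else:
            # Same header, different case: sequential numeric suffix,
            # with the first occurrence left unsuffixed.
            k = dup_idx.index(i)
            out.append(low if k == 0 else f"{low}_{k}")
    return out
```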


Source Considerations

If your Google Drive account does not contain any new Events to be ingested, Hevo defers the data ingestion for a pre-determined time. Hevo re-attempts to fetch the data only after the deferment period elapses. Read Deferred Data Ingestion.
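The deferment behavior can be pictured with a small sketch. The function name and delay values below are hypothetical, chosen purely to illustrate the idea of backing off when no new Events are found:

```python
def next_poll_delay_s(found_new_events, normal_cadence_s=300, deferment_s=3600):
    # Hypothetical values: poll at the normal cadence while new Events keep
    # arriving; otherwise defer the next fetch until the deferment period
    # elapses, as described in Deferred Data Ingestion.
    return normal_cadence_s if found_new_events else deferment_s
```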


Schema and Data Model

The following are some important considerations regarding the schema and the data model used by Hevo to ingest and upload the Google Drive data to the Destination:

  • Hevo treats each Google Drive folder as a Pipeline object.

  • Google Sheets: Hevo uses the naming convention foldername_worksheetname to name the Event Types, where foldername is the name of the Google Drive folder containing the Google Sheet or Excel spreadsheet, and worksheetname is the name of the worksheet tab within the spreadsheet. For example, if your folder name is Westeros and the worksheet name is usa, the name of the Event Type becomes Westeros_usa.

    For Google Sheet or Excel files that contain multiple worksheet tabs:

    • If the worksheet names differ across Google Sheets, Hevo ingests each worksheet as an individual Event Type and replicates its data to a corresponding table in the Destination. For example, if there are two worksheets named Westeros Sheet 1 and Westeros Sheet 2 in two different Google Sheets, Hevo ingests them as individual Event Types named westeros_sheet_1 and westeros_sheet_2.

      Note: Each Event Type consists of all the fields in the corresponding worksheet.

    • If the worksheet names are the same across multiple Google Sheets, Hevo ingests all such worksheets as a single Event Type and replicates them to one table in the Destination. For example, if there are two worksheets named Westeros Sheet 1 in two different Google Sheets, Hevo ingests them as one Event Type named westeros_sheet_1, and the data in both sheets is loaded into one Destination table.

    • If the worksheets contain duplicate column headers, Hevo updates the column headers as described in the section Treatment of Duplicate Column Headers, and then ingests and loads them to your Destination.

  • CSV and TSV Files: Hevo ingests data from all the files of a Google Drive folder as one Event Type. The name of the Event Type is the name of the Drive folder. For example, if the name of your Google Drive folder is Westeros, all CSV files are ingested into the Event Type Westeros.

  • __hevo_id, __hevo_ingested_at, and __hevo_loaded_at are additional columns that Hevo creates in the Destination system while replicating the Google Drive data. Read Hevo-generated metadata.
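The naming conventions above can be summarized in a short sketch. This is an illustrative approximation, not Hevo's actual code; the examples in this section vary in casing (Westeros_usa vs. westeros_sheet_1), so this sketch simply lowercases throughout:

```python
def event_type_name(folder_name, worksheet_name=None):
    # Sheets/Excel worksheets map to foldername_worksheetname; CSV/TSV files
    # map to the folder name alone. Lowercasing and space-to-underscore
    # normalization follow the westeros_sheet_1 example in this section.
    base = folder_name if worksheet_name is None else f"{folder_name}_{worksheet_name}"
    return base.strip().lower().replace(" ", "_")
```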


Limitations

  • Hevo supports ingesting only Excel, Google Sheets, CSV, and TSV files from Drive.

  • If the name of any Drive folder is modified after the Pipeline is created, new Events are ingested under a different Event name derived from the new folder name.



See Also


Revision History

Refer to the following table for the list of key updates made to this page:

Date Release Description of Change
Sep-21-2022 NA Updated section, Schema and Data Model to add examples for the scenarios.
Apr-11-2022 NA Updated screenshot in the Configuring Google Drive as a Source section to reflect the latest UI.
Apr-11-2022 1.86 Updated sections, Schema and Data Model and Limitations to reflect support for TSV file format.
Apr-11-2022 1.86 Added section, Treatment of Duplicate Column Headers.
Mar-21-2022 1.85 Updated section, Limitations to remove the point about Hevo not supporting UTF-16 encoding format for CSV files.
Mar-07-2022 1.83 Added content in the section, Schema and Data Model, regarding worksheets with duplicate column headers.
Dec-06-2021 1.77 Removed the limitation of not ingesting data from shared Drives, as Hevo supports it now.
Sep-09-2021 NA Added the limitation about Hevo not supporting ingestion of shared Drives. Read Limitations.
Aug-08-2021 NA Added a note in the Source Considerations section about Hevo deferring data ingestion in Pipelines created with this Source.
Jul-26-2021 NA Added a note in the Overview section about Hevo providing a fully-managed Google BigQuery Destination for Pipelines created with this Source.
Jun-28-2021 1.66 Updated the page overview with information about __hevo_source_modified_at being uploaded as a metadata field from Release 1.66 onwards.
May-05-2021 1.62 - Included steps to connect to Drive using a service account.
- Updated the document as per the latest UI and functionality.
Feb-22-2021 NA Added the limitation about Hevo not supporting UTF-16 encoding format for CSV data. Read Limitations.
Last updated on 27 Sep 2022
