Google Cloud Storage (GCS)
Hevo lets you load data from files in your GCS bucket into your data warehouse.
Connection Settings
Connect the Google account that has access to the GCS bucket from which you want to ingest data.
- Pipeline Name: A unique name for this Pipeline.
- Bucket: Name of the bucket from which you want to ingest data.
- Path Prefix: Path prefix for the data directory. By default, the files are listed from the root of the directory.
- File Format: Choose a file format. Hevo currently supports AVRO, CSV, JSON, and XML formats. Contact Hevo Support if your Source data is in a different format.
Based on the format you select, you must specify some additional settings:
- CSV:
  - Specify the Field Delimiter. This is the character that separates the fields in each line. For example, `\t` or `,`.
  - Disable the Treat First Row As Column Headers option if the Source data file does not contain column headers. Hevo then automatically creates these during ingestion. Default setting: Enabled. See the example below.
  - Enable the Create Event Types from folders option if the path prefix has subdirectories containing files in different formats. Hevo reads each of the subdirectories as a separate Event Type.
    Note: Files lying at the prefix path (and not in a subdirectory) are ignored.
- JSON:
  - Enable the Create Event Types from folders option if the path prefix has subdirectories containing files in different formats. Hevo reads each of the subdirectories as a separate Event Type.
    Note: Files lying at the prefix path (and not in a subdirectory) are ignored.
- XML:
  - Enable the Create Events from child nodes option to load each node under the root node in the XML file as a separate Event.
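The effect of the Create Event Types from folders option can be approximated locally with a short Python sketch. It groups object names by the first subdirectory under the path prefix and skips files that sit directly at the prefix. The function name `group_by_event_type`, the sample object names, and the choice to name each group after its folder are assumptions made for illustration; they are not taken from Hevo.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_by_event_type(object_names: Iterable[str], path_prefix: str) -> Dict[str, List[str]]:
    """Group object names by the first subdirectory under path_prefix.

    Hypothetical helper: the grouping key (the folder name) is an assumption
    about how an Event Type might be named, not Hevo's documented behavior.
    """
    prefix = path_prefix.strip("/")
    prefix = prefix + "/" if prefix else ""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in object_names:
        if not name.startswith(prefix):
            continue  # object lies outside the configured prefix
        relative = name[len(prefix):]
        folder, separator, rest = relative.partition("/")
        if not separator or not rest:
            continue  # file lies directly at the prefix path, so it is ignored
        groups[folder].append(name)
    return dict(groups)


# Made-up object names for illustration only.
objects = [
    "exports/orders/2021-01.csv",
    "exports/orders/2021-02.csv",
    "exports/customers/all.json",
    "exports/readme.csv",
]
event_types = group_by_event_type(objects, "exports/")
```

Here `exports/readme.csv` is dropped because it lies at the prefix path rather than in a subdirectory, which mirrors the note above.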
Things to Note
- Hevo automatically decompresses gzipped files during ingestion.
- Files are re-ingested when they are updated.
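If you want to compress files before placing them in the bucket, the following is a minimal sketch using Python's standard `gzip` module. The helper names `gzip_file` and `read_gzipped_text` and the file name `sample.csv` are hypothetical, and uploading the result to GCS is left out.

```python
import gzip
import shutil
import tempfile
from pathlib import Path


def gzip_file(source: Path) -> Path:
    """Write a gzip-compressed copy of source next to it and return its path."""
    target = source.with_name(source.name + ".gz")
    with source.open("rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target


def read_gzipped_text(path: Path) -> str:
    """Decompress a gzipped UTF-8 text file and return its contents."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return handle.read()


# Round trip on a temporary file to check that no data is lost.
sample = "CLAY COUNTY,32003,11973623\nNASSAU COUNTY,32025,88310177\n"
with tempfile.TemporaryDirectory() as workdir:
    csv_path = Path(workdir) / "sample.csv"  # hypothetical file name
    csv_path.write_bytes(sample.encode("utf-8"))
    gz_path = gzip_file(csv_path)
    gz_name = gz_path.name
    restored = read_gzipped_text(gz_path)
```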
Example: Automatic Column Header Creation for CSV Tables
Consider the following data in CSV format, which has no column headers.
CLAY COUNTY,32003,11973623
CLAY COUNTY,32003,46448094
CLAY COUNTY,32003,55206893
CLAY COUNTY,32003,15333743
SUWANNEE COUNTY,32060,85751490
SUWANNEE COUNTY,32062,50972562
ST JOHNS COUNTY,846636,32033,
NASSAU COUNTY,32025,88310177
NASSAU COUNTY,32041,34865452
If you disable the Treat First Row As Column Headers option, Hevo auto-generates the column headers. The generated headers appear in the schema map and in the records loaded to the Destination.
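As a rough local illustration, and not Hevo's actual output, the sketch below reads headerless CSV rows and assigns a placeholder name to each column. The names `column_1`, `column_2`, and so on are assumptions chosen for the example; the headers that Hevo generates may be named differently.

```python
import csv
import io
from typing import Dict, List


def rows_with_generated_headers(text: str, delimiter: str = ",") -> List[Dict[str, str]]:
    """Parse headerless CSV text and key each value by a generated column name.

    The column_N naming scheme is hypothetical, used only for illustration.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    records = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        records.append({f"column_{i}": value for i, value in enumerate(row, start=1)})
    return records


# Two rows taken from the sample data above.
sample = "CLAY COUNTY,32003,11973623\nSUWANNEE COUNTY,32060,85751490\n"
records = rows_with_generated_headers(sample)
```

Every value stays a string in this sketch; no type inference is attempted.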
Limitations
- Hevo does not support the UTF-16 encoding format for CSV files. As a workaround, you can convert the files to the UTF-8 encoding format before the Pipeline ingests them.
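The conversion workaround can be done with any tool that re-encodes text. The following is a minimal Python sketch, assuming the source file carries a byte order mark (without one, Python's `utf-16` codec falls back to the platform's native byte order). The function name `convert_utf16_to_utf8` and the file names are hypothetical.

```python
import tempfile
from pathlib import Path


def convert_utf16_to_utf8(source: Path, target: Path) -> None:
    """Re-encode a UTF-16 text file as UTF-8, leaving line endings untouched."""
    # newline="" disables newline translation on both the read and the write side.
    with source.open("r", encoding="utf-16", newline="") as src, \
            target.open("w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(line)


# Round trip on a temporary file; the names below are illustrative only.
sample = "CLAY COUNTY,32003,11973623\nNASSAU COUNTY,32025,88310177\n"
with tempfile.TemporaryDirectory() as workdir:
    utf16_path = Path(workdir) / "input_utf16.csv"
    utf8_path = Path(workdir) / "output_utf8.csv"
    utf16_path.write_bytes(sample.encode("utf-16"))  # encode() prepends a BOM
    convert_utf16_to_utf8(utf16_path, utf8_path)
    converted = utf8_path.read_bytes()
```

The converted file is written without a byte order mark, since Python's `utf-8` codec does not emit one.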
Revision History
Refer to the following table for the list of key updates made to the page:
Date | Release No. | Description of Change |
---|---|---|
22-Feb-2021 | NA | Added the limitation about Hevo not supporting UTF-16 encoding format for CSV data. Read Limitations. |