Amplitude Analytics generates thorough product analytics of web and mobile application usage to help you make data-driven decisions. You can replicate data from your Amplitude account to a database, data warehouse, or file storage system using Hevo Pipelines.
Note: Hevo fetches data from Amplitude Analytics in a zipped folder to perform the data query.
For creating Pipelines using this Source, Hevo provides you with a fully managed BigQuery data warehouse as a possible Destination. This option remains available until you set up your first BigQuery Destination, irrespective of any other Destinations you may have. With the managed warehouse, you are charged only the cost that Hevo incurs for your project in Google BigQuery. The invoice is generated at the end of each month, and payment is recovered through the payment instrument you have set up. You can create your Pipeline and directly start analyzing your Source data. Read Hevo Managed Google BigQuery.
Prerequisites

- An active account on Amplitude with access to at least one project.
Configuring Amplitude Analytics as a Source
Perform the following steps to configure Amplitude Analytics as the Source in your Pipeline:
1. Click PIPELINES in the Asset Palette.
2. Click + CREATE in the Pipelines List View.
3. In the Select Source Type page, select Amplitude Analytics.
4. In the Configure your Amplitude Analytics Source page, specify the following:
   - Pipeline Name: A unique name for your Pipeline.
   - API Key: The API key you retrieved from your Amplitude account.
   - Secret Key: The secret key you retrieved from your Amplitude account.
   - Historical Sync Duration: The duration for which the existing data in the Source must be ingested. Default value: 3 Months.
5. Click TEST & CONTINUE.
6. Proceed to configuring the data ingestion and setting up the Destination.
Retrieving the Amplitude API Key and Secret
1. Log in to your Amplitude account.
2. In the left navigation pane, scroll down and click Settings.
3. In the Org Settings page, click Projects in the left pane, and select the project whose data you want to sync.
4. In the project details, copy the API Key and Secret Key shown on the screen, and save them securely.
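Amplitude's Export API authenticates with HTTP Basic Auth, using the API key as the username and the secret key as the password. The following sketch shows how those two credentials combine into an `Authorization` header; the credential values are hypothetical placeholders.

```python
import base64

def amplitude_auth_header(api_key: str, secret_key: str) -> dict:
    """Build the HTTP Basic Auth header used by Amplitude's Export API:
    the API key acts as the username and the secret key as the password."""
    token = base64.b64encode(f"{api_key}:{secret_key}".encode()).decode()
    return {"Authorization": f"Basic {token}"}

# Hypothetical credentials for illustration only.
headers = amplitude_auth_header("my-api-key", "my-secret-key")
```

Hevo performs this authentication for you when you enter the keys in the Source configuration; the snippet is only meant to show what the two keys are used for.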
Data Replication

| Default Pipeline Frequency | Minimum Pipeline Frequency | Maximum Pipeline Frequency | Custom Frequency Range (Hrs) |
|---|---|---|---|
| 1 Hr | 1 Hr | 24 Hrs | 1-24 |
The custom frequency must be set in hours, as an integer value. For example, 1, 2, or 3, but not 1.5 or 1.75.
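The whole-hour constraint above can be captured in a small validation check; this is a sketch of the rule as stated, not Hevo's actual validation code.

```python
def validate_custom_frequency(hours) -> bool:
    """Accept only whole-hour values in the supported 1-24 range."""
    return (isinstance(hours, int)
            and not isinstance(hours, bool)  # bool is a subclass of int
            and 1 <= hours <= 24)

# 1 and 24 are valid; 1.5, 0, and 25 are not.
```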
The size of the zipped folder fetched from the Source must not exceed 4 GB; otherwise, the data query fails with the exception, Invalid CEN header. If this exception occurs, Hevo automatically adjusts the ingestion duration for the historical load and the incremental data, ingesting the data in smaller zip files over multiple cycles.
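The adaptive adjustment described above can be pictured as repeatedly halving an export window until each sub-window's zip fits under the cap. This is a hypothetical sketch of the idea, not Hevo's implementation; `fetch_size` stands in for a call that reports the zip size of a window without downloading it.

```python
def split_until_under_cap(window, fetch_size, cap_bytes=4 * 1024**3):
    """Recursively halve a (start_hour, end_hour) window until the
    zipped export for each sub-window fits under the size cap, then
    return the sub-windows in chronological order."""
    start, end = window
    if fetch_size(start, end) <= cap_bytes or end - start <= 1:
        return [(start, end)]
    mid = (start + end) // 2
    return (split_until_under_cap((start, mid), fetch_size, cap_bytes)
            + split_until_under_cap((mid, end), fetch_size, cap_bytes))
```

For instance, a 4-hour window whose export would be 8 GB splits into two 2-hour windows of 4 GB each, which are then ingested over separate cycles.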
- Historical Data: In the first run of the Pipeline, Hevo ingests the historical data for all the objects, based on the historical sync duration selected while creating the Pipeline, and loads it to the Destination. Default duration: 3 Months.
- Incremental Data: Once the historical load is complete, all new and updated records are synchronized with your Destination as per the Pipeline frequency.
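Incremental replication of this kind is commonly built around a cursor that records the newest timestamp already loaded, so each run fetches only records that arrived after it. This is a generic illustration of that pattern, not Hevo's internal code.

```python
def incremental_sync(records, last_synced_ts):
    """Return the records newer than the stored cursor, plus the
    advanced cursor. Each record is a (timestamp, payload) tuple;
    a real pipeline would persist the cursor between runs."""
    fresh = [r for r in records if r[0] > last_synced_ts]
    new_cursor = max((r[0] for r in fresh), default=last_synced_ts)
    return fresh, new_cursor
```

On each scheduled run, only the records past the cursor are loaded, and the cursor advances to the newest timestamp seen.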
The following is the list of tables (objects) that are created at the Destination when you run the Pipeline.
| Object | Description |
|---|---|
| Cohort | A list of all unique behavioural cohorts created within Amplitude. |
| Event | An action that a user takes in your product. This could be anything from pushing a button to completing a level or making a payment. |
| Event Category | All event data is mapped to an Event Category entity, which helps categorise and describe live events and properties. |
| Event Type | All events are mapped to an Event Type entity, which is maintained in this table. |
| Group | Each grouping of users created in Amplitude, along with its dedicated name and description. |
| User | Any person who has logged at least one event and to whom events are attributed. |
| User Cohort | A mapping between a User and the User Cohorts they belong to. |
| User Group | Groups of users defined by their actions within a specific time period. |
Schema and Primary Keys
Hevo uses the following schema to upload the records in the Destination:
The User object defines each unique user through a combination of User ID, Amplitude ID, and Device ID. You can reference these three columns while making joins to the Event object.
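A join between Event and User rows on that three-column composite key can be sketched as follows; the sample field names beyond the three key columns are hypothetical, and a real query would typically be written in your warehouse's SQL dialect rather than Python.

```python
def join_events_to_users(events, users):
    """Join Event rows to User rows on the composite key
    (user_id, amplitude_id, device_id) that the User object
    exposes for joins. Rows are plain dicts."""
    index = {(u["user_id"], u["amplitude_id"], u["device_id"]): u
             for u in users}
    joined = []
    for e in events:
        key = (e["user_id"], e["amplitude_id"], e["device_id"])
        user = index.get(key)
        if user is not None:  # inner join: drop events with no matching user
            joined.append({**e, "user": user})
    return joined
```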
There is a two-hour delay before the data exported from Amplitude Analytics is loaded into your data warehouse. For example, data sent between 8-9 PM begins to load at 9 PM and becomes available in your Destination after 11 PM, depending on the load frequency you have set.
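The earliest-availability estimate implied by that example can be worked out as: the hour window containing the event closes, and the export becomes loadable roughly two hours later. This is an illustration of the arithmetic only; the actual moment data lands also depends on your Pipeline frequency.

```python
from datetime import datetime, timedelta

def earliest_availability(sent_at: datetime) -> datetime:
    """Estimate the earliest time an event becomes available in the
    Destination: the event's hour window closes, then there is a
    roughly two-hour export delay. Mirrors the 8-9 PM -> after 11 PM
    example; the scheduled Pipeline run adds further latency."""
    hour_end = sent_at.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)
    return hour_end + timedelta(hours=2)
```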
Revision History

Refer to the following table for the list of key updates made to this page:
| Date | Release | Description of Change |
|---|---|---|
| Jun-21-2022 | 1.91 | - Modified section, Configuring Amplitude Analytics as a Source to reflect the latest UI changes.<br>- Updated the Pipeline frequency information in the Data Replication section. |
| Mar-07-2022 | 1.83 | Updated the introduction paragraph and the section, Data Replication, about automatic adjustment of ingestion duration. |
| Oct-25-2021 | NA | Added the Pipeline frequency information in the Data Replication section. |
| Apr-06-2021 | 1.60 | - Added a note to the section, Schema and Primary Keys.<br>- Updated the ERD. |