Salesforce is a cloud computing Software as a Service (SaaS) company that allows you to use cloud technology to connect more effectively with customers, partners, and potential customers.
For creating Pipelines using this Source, Hevo provides you a fully managed BigQuery data warehouse as a possible Destination. This option remains available until you set up your first BigQuery Destination, irrespective of any other Destinations that you may have. With the managed warehouse, you are charged only the cost that Hevo incurs for your project in Google BigQuery. The invoice is generated at the end of each month, and payment is collected through the payment instrument you have set up. You can now create your Pipeline and directly start analyzing your Source data. Read Hevo Managed Google BigQuery.
Hevo uses Salesforce’s Bulk API to replicate the data from your Salesforce applications to the Destination database or data warehouse. To enable this, you need to authorize Hevo to access data from the relevant Salesforce environment.
Salesforce allows businesses to create accounts in multiple environments, such as:
Production: This is the environment that holds live customer data and is used to actively run your business. A production org is identified by URLs starting with https://login.salesforce.com.
Sandbox: This is a copy of your production organization. You can create multiple sandbox environments for different purposes, such as development and testing. Working in the sandbox eliminates any risk of compromising your production data and applications. A sandbox is identified by URLs starting with https://test.salesforce.com.
An active Salesforce production account or sandbox account.
History tracking is enabled to track history objects.
Configuring Salesforce as a Source
Perform the following steps to configure Salesforce as a Source in your Pipeline:
Click PIPELINES in the Asset Palette.
Click + CREATE in the Pipelines List View.
In the Select Source Type page, select Salesforce.
In the Configure Your Salesforce Account page, click on + Add Salesforce Account or + Add Another Account (if an account already exists).
Select the environment from which Hevo must ingest the data, and then click Continue.
Log in to your Salesforce account.
Click Allow to authorize Hevo to access your Salesforce Account.
You are then redirected to the Configure Your Salesforce Account page.
Specify the Pipeline Name, and then click CONTINUE.
Proceed to configuring the data ingestion and setting up the Destination.
|Default Pipeline Frequency||Minimum Pipeline Frequency||Maximum Pipeline Frequency||Custom Frequency Range (Hrs)|
|3 Hrs||15 Mins||24 Hrs||1-24|
The custom frequency must be set in hours, as an integer value. For example, 1, 2, 3 but not 1.5 or 1.75.
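The frequency rule above can be expressed as a small check. This is a minimal sketch, not part of Hevo's product or API; the function name `is_valid_custom_frequency` is hypothetical and only illustrates the "whole hours between 1 and 24" constraint.

```python
def is_valid_custom_frequency(hours) -> bool:
    """Return True if `hours` is a whole number of hours in the 1-24 range.

    Hypothetical helper for illustration; it mirrors the rule stated in
    the table above and is not a Hevo API.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(hours, bool) or not isinstance(hours, int):
        return False
    return 1 <= hours <= 24


print(is_valid_custom_frequency(3))    # whole hours inside the range
print(is_valid_custom_frequency(1.5))  # fractional hours are not allowed
```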
If your Salesforce account does not contain any new Events to be ingested, Hevo defers the data ingestion for a pre-determined time. Hevo attempts to fetch the data again only after the deferment period elapses. Read Deferred Data Ingestion.
Hevo loads all the objects linked with your Salesforce account.
- Historical Data: Once you create the Pipeline, all data associated with the Salesforce account is ingested by Hevo and loaded into your Destination.
- Incremental Data: Once the historical load is completed, all new and updated records are synchronized with your Destination.
Derived fields, that is, fields that derive their value from other fields or formulas, are not updated in the Destination during incremental loads even if their value changes due to a change in the formula or the original field.
In Salesforce, whenever any change occurs in an object, its SystemModStamp timestamp field is updated. Hevo uses this SystemModStamp field to identify Events for incremental ingestion. In case of derived fields, a change in the formula or the original field does not affect the object’s SystemModStamp value. Due to this, such objects are not picked up in the incremental load. However, if another field in the object is updated alongside, then the subsequent incremental load picks up the derived field updates also.
As a workaround, you can restart the historical load for the object. If the object was created after Pipeline creation, you need to restart the historical load at the Pipeline level.
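The timestamp-based selection described above can be illustrated with a short sketch. It is an in-memory model under stated assumptions, not Hevo's implementation: the record dictionaries, the `changed_since` helper, and the sample dates are all hypothetical, and the field is spelled as it appears on this page.

```python
from datetime import datetime, timezone


def changed_since(records, last_sync):
    """Return the records whose SystemModStamp is later than last_sync."""
    return [r for r in records if r["SystemModStamp"] > last_sync]


# Hypothetical time of the previous incremental run.
last_sync = datetime(2022, 5, 1, tzinfo=timezone.utc)

records = [
    # A regular field was edited after the last sync, so the timestamp moved.
    {"Id": "001A", "SystemModStamp": datetime(2022, 5, 2, tzinfo=timezone.utc)},
    # Only a formula changed; the timestamp still predates the last sync.
    {"Id": "001B", "SystemModStamp": datetime(2022, 4, 20, tzinfo=timezone.utc)},
]

picked = [r["Id"] for r in changed_since(records, last_sync)]
print(picked)  # ['001A']
```

The second record is skipped even though its derived value changed, which is why restarting the historical load is needed to refresh it.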
When a record from a replicable object is deleted in Salesforce, the IsDeleted column for it is set to True. Salesforce moves the deleted records to the Salesforce Recycle Bin, and they are not displayed in the Salesforce dashboard. When Hevo starts the data replication from your Source, using either the Bulk APIs or REST APIs, it also replicates data from the Salesforce Recycle Bin to your Destination. As a result, you might see more Events in your Destination than in the Source.
Salesforce retains deleted data in its Recycle Bin for 15 days. Thus, if your Pipeline is paused for more than 15 days, Hevo cannot replicate the deleted data to your Destination. Apart from this, Salesforce automatically purges the oldest records in the Recycle Bin every two hours if the number of records in the Recycle Bin exceeds the record limit for your organization, which is 25 times your organization’s storage capacity. Therefore, to capture information about the deleted data, you must run the Pipeline within two hours of deleting the data in Salesforce.
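The two Recycle Bin limits above can be sketched as simple arithmetic. This is an illustration only: the helper names are hypothetical, the storage capacity is assumed to be expressed in MB (the page does not state the unit), and the early purge that applies when the record limit is exceeded is not modeled.

```python
from datetime import datetime, timedelta

# Retention period stated above.
RETENTION = timedelta(days=15)


def recycle_bin_record_limit(storage_capacity_mb: int) -> int:
    """Record limit = 25 x storage capacity (unit assumed to be MB)."""
    return 25 * storage_capacity_mb


def is_still_retained(deleted_at: datetime, run_at: datetime) -> bool:
    """True if a Pipeline run at `run_at` falls inside the retention window.

    Ignores the two-hourly purge of the oldest records that happens when
    the Recycle Bin exceeds its record limit.
    """
    return run_at - deleted_at <= RETENTION


print(recycle_bin_record_limit(1000))
print(is_still_retained(datetime(2022, 1, 1), datetime(2022, 1, 10)))
print(is_still_retained(datetime(2022, 1, 1), datetime(2022, 1, 20)))
```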
The maximum number of Events that can be ingested per day is calculated based on your organization’s quota of batches.
Suppose your organization is allocated a quota of 15000 batches per 24 hours, and each batch can contain a maximum of 10000 Events.
Then, the daily Event consumption is calculated as follows:
The number of batches created per Object (X) = Number of Events for the Object/10000.
Note: This value, X, is rounded up to the next integer.
The total number of batches created across all Objects in the Pipeline (Y) = Sum of the number of batches created for each Object (ΣX).
This number, Y, is the number of batches that are submitted in one run of the Pipeline, and it may vary in each run. The limit on the number of batches per run is calculated as follows:
The number of Pipeline runs in a day (Z) = 24/Ingestion frequency (in hours).
The number of batches that can be submitted in a day = 15000
The maximum number of batches that can be submitted in one run of the Pipeline = 15000/Z.
Suppose you have two Objects containing 55800 and 25000 Events respectively, and the ingestion frequency is 12 hours. Then,
The number of batches created for Object 1 (X1) = 55800/10000 = 5.58.
Therefore, six batches are created; five with 10000 Events each and the sixth with 5800 Events.
The number of batches created for Object 2 (X2) = 25000/10000 = 2.5.
Therefore, three batches are created; two with 10000 Events each and the third with 5000 Events.
The total number of batches created across all Objects in the Pipeline (Y) = X1 + X2 = 6 + 3 = 9.
These nine batches are submitted in one run of the Pipeline.
Now, as the Ingestion frequency is 12 hours,
The total number of Pipeline runs in 24 hours (Z) = 24/12 = 2.
The maximum number of batches that can be submitted in one run of the Pipeline = 15000/2 = 7500.
Here, against the available limit of 7500 batches per Pipeline run, only 9 batches are being submitted.
Therefore, as long as Z x Y <= 15000, you are within the daily prescribed quota.
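The calculation above can be reproduced with a short script. This is a sketch of the arithmetic only, using the example figures from this section (a quota of 15000 batches and 10000 Events per batch); the function and constant names are hypothetical, and your organization's actual quota may differ.

```python
import math

# Example figures from this section; not fixed Salesforce or Hevo values.
DAILY_BATCH_QUOTA = 15000
MAX_EVENTS_PER_BATCH = 10000


def batches_for_object(event_count: int) -> int:
    """X: batches needed for one Object, rounded up to the next integer."""
    return math.ceil(event_count / MAX_EVENTS_PER_BATCH)


def daily_batch_usage(event_counts, frequency_hours):
    """Return (Y, Z, Z x Y) for the given Objects and ingestion frequency."""
    y = sum(batches_for_object(n) for n in event_counts)  # batches per run
    z = 24 / frequency_hours                              # runs per day
    return y, z, z * y


y, z, usage = daily_batch_usage([55800, 25000], frequency_hours=12)
print(y, z, usage)
print("within quota:", usage <= DAILY_BATCH_QUOTA)
```

With the two Objects from the example, this gives Y = 9 and Z = 2, so 18 batches are used per day against the quota of 15000.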
Schema and Primary Keys
Hevo uses the following schema to upload the records in the Destination:
Hevo uses the following data model to ingest data from your Salesforce account:
|Account||Represents information about the company or business user.|
|Campaign||Represents campaigns and tracks their efficiency with cost, revenue, and converted leads analysis.|
|Contact||Represents a company or a person associated with an account that can become a potential customer.|
|Event||Represents an event in the calendar.|
|Lead||Tracks valuable prospects apart from contacts, and converts them into opportunities.|
|Opportunity||Tracks and stores your deals in progress.|
|Product||Represents a product your company sells.|
|Custom Objects||Represents custom objects, entities that support custom objects, and their standard fields, named with a suffix
|StandardObjectNameShare||Represents a model for all share objects associated with standard objects.|
|StandardObjectNameHistory||Represents a model for all the history objects associated with standard objects.|
|StandardObjectNameOwnerSharingRule||Represents a model for all owner sharing rule objects associated with standard objects.|
|StandardObjectNameFeed||Represents a model for all the feed objects associated with standard objects.|
- Hevo does not fetch any columns of Compound data type.
Refer to the following table for the list of key updates made to this page:
|Date||Release||Description of Change|
|May-11-2022||NA||Added a Source consideration about derived fields not getting picked up during incremental loads and the workaround to ingest the associated Events.|
|Mar-07-2022||NA||- Updated and organized the content in the section, Source Considerations. - Removed the bulk APIs limitation as REST APIs are also supported now.|
|Jan-07-2022||1.79||Added information about configurable historical sync duration in the Data Replication section.|
|Jan-03-2022||1.79||Added information about reverse historical load in the Data Replication section.|
|Oct-25-2021||NA||Added the Pipeline frequency information in the Data Replication section.|
|Oct-04-2021||1.73||Updated the Source Considerations section with an example of calculating quota usage.|
|Sep-09-2021||NA||Updated the Limitations section to remove the limitations around ingestion of the
|Aug-08-2021||NA||Added a note in the Source Considerations section about Hevo deferring data ingestion in Pipelines created with this Source.|
|Jul-26-2021||NA||Added a note in the Overview section about Hevo providing a fully-managed Google BigQuery Destination for Pipelines created with this Source.|
|Feb-22-2021||1.57||Included the setup guide on the Salesforce Source configuration UI.|