MongoDB is a document-oriented NoSQL database that supports high-volume data storage. MongoDB uses collections and documents instead of tables and rows to organize data. Collections, the equivalent of tables in an RDBMS, hold sets of documents. Documents, or objects, are similar to the records of an RDBMS. Insert, update, and delete operations can be performed on a collection within MongoDB.
Hevo supports two ways of configuring MongoDB:
- MongoDB Atlas - This option is applicable if your MongoDB database is hosted on MongoDB Atlas.
- Generic MongoDB - This option is applicable for all deployments of MongoDB other than MongoDB Atlas.
The following table lists the differences between the two:
|Generic MongoDB||MongoDB Atlas|
|Managed by the user or a third party other than MongoDB Atlas.||Fully managed database service from MongoDB.|
|User must retrieve connection details like the list of all databases in the cluster, port number, and …||User needs to provide only the hostname from MongoDB. This URL encapsulates all the required details, such as the database IDs, …|
MongoDB supports three configurations:
- Standalone without replicas: Includes a single instance of MongoDB.
- Standalone with replicas: Includes one primary instance of MongoDB. Secondary instances follow the primary one by replicating data from it. While configuring MongoDB as a Source, you must provide a comma-separated list of all the instances in the Database Host field.
- Clustered: Includes three components: a shard router (mongos), a config server (mongod), and a data shard (mongod). Each data shard can have its own replicas for redundancy. While configuring MongoDB as a Source, you must provide a comma-separated list of all instances of the MongoDB Routers in the Database Host field.
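As a rough illustration of the Database Host value described above, the following sketch joins a list of instance addresses into one comma-separated string. The function name `database_host_field` and the example hostnames are hypothetical; whether each entry also needs an explicit port depends on your deployment and is not covered here.

```python
def database_host_field(hosts):
    """Join MongoDB instance addresses into a comma-separated string.

    Hypothetical helper: strips surrounding whitespace, drops empty
    entries, and removes duplicates while keeping the original order.
    """
    seen = set()
    cleaned = []
    for host in hosts:
        host = host.strip()
        if host and host not in seen:
            seen.add(host)
            cleaned.append(host)
    if not cleaned:
        raise ValueError("at least one MongoDB host is required")
    return ",".join(cleaned)


# Example: a primary and two secondaries (made-up hostnames).
replica_hosts = ["mongo-0.example.com", " mongo-1.example.com", "mongo-2.example.com"]
print(database_host_field(replica_hosts))
```

For a clustered deployment, the same helper would be given the addresses of the shard routers (mongos) rather than the data shards.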
MongoDB version 4.0 or above, if the Pipeline mode is Change Streams. OpLog is compatible with all versions of MongoDB.
Use the following command to find out the MongoDB version on Ubuntu.
~$ mongod --version
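To compare the reported version against the Change Streams requirement above, a small sketch like the following could be used. It assumes the command's output contains a line of the form `db version v4.4.6`; the function names are hypothetical, and the sample string stands in for real command output.

```python
import re

CHANGE_STREAMS_MIN_VERSION = (4, 0)  # minimum version stated in the requirement above


def parse_mongod_version(output):
    """Extract (major, minor) from text such as 'db version v4.4.6'."""
    match = re.search(r"db version v(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError("no 'db version vX.Y' line found in output")
    return int(match.group(1)), int(match.group(2))


def supports_change_streams(output):
    """True if the parsed version meets the 4.0 minimum for Change Streams."""
    return parse_mongod_version(output) >= CHANGE_STREAMS_MIN_VERSION


sample_output = "db version v4.4.6"  # assumed sample, not captured from a real server
print(supports_change_streams(sample_output))
```

Comparing `(major, minor)` tuples avoids the pitfalls of comparing version strings directly, where "10.0" would sort before "4.0".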
A MongoDB user with read access to the database that is to be replicated and to the
(Recommended) A retention period of at least 72 hours in the OpLog, so that the OpLog is not purged before Hevo can read it. Read OpLog Alerts.
Configuring MongoDB as a Source in Hevo
To configure MongoDB as the Source for your Pipeline in Hevo, click CREATE PIPELINE on the Pipeline Overview page, select MongoDB as the Source, and specify the following settings:
1. Pipeline Mode
Select how you want Hevo to read your data from the MongoDB Source:
OpLog: Data is polled using MongoDB’s OpLog. The OpLog is a collection of individual, transaction-level details which help replicas sync data from the primary instance.
Note: OpLogs are present in data/standalone primary instances and replicas.
Change Streams: MongoDB’s Change Streams enable applications to stream real-time data changes for a single collection, a database, or an entire deployment, without the complexity and risk of tailing the OpLog. Change Streams are supported for all MongoDB configurations. However, for the clustered configuration, Change Streams work only if set up against a shard router (mongos).
Read OpLog Alerts.
2. MongoDB Service Provider
Select the MongoDB service provider that you use to manage your MongoDB databases:
- Generic Mongo Database: Database management is done at your end, or by a service provider other than MongoDB Atlas.
- MongoDB Atlas: The managed database service from MongoDB.
3. MongoDB Connection Settings
Refer to the following sections based on your MongoDB deployment:
Generic MongoDB. Read Configuring Generic MongoDB as a Source.
MongoDB Atlas. Read Configuring MongoDB Atlas as a Source.
If the OpLog retention period is set to less than 24 hours, a one-time warning is displayed in a snack bar in your MongoDB Pipeline: OpLog may expire in <n> hours. We recommend having enough space to retain the OpLog for at least 24 hours to avoid disruption in replication due to spikes.
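The retention check described above can be approximated as follows. This is a minimal sketch, assuming you already know the timestamps of the oldest and newest OpLog entries (for example, from the mongo shell's `rs.printReplicationInfo()`); the function names and the sample timestamps are hypothetical, and the 24-hour threshold is the one quoted in the warning. It is not a description of how Hevo computes the value.

```python
from datetime import datetime, timedelta

WARNING_THRESHOLD_HOURS = 24  # threshold quoted in the warning above


def oplog_window_hours(oldest_entry, newest_entry):
    """Return the time span covered by the OpLog, in hours."""
    if newest_entry < oldest_entry:
        raise ValueError("newest entry is earlier than oldest entry")
    return (newest_entry - oldest_entry) / timedelta(hours=1)


def retention_is_too_short(oldest_entry, newest_entry,
                           threshold_hours=WARNING_THRESHOLD_HOURS):
    """True if the OpLog covers less time than the threshold."""
    return oplog_window_hours(oldest_entry, newest_entry) < threshold_hours


# Made-up timestamps: the OpLog here spans 18 hours.
oldest = datetime(2024, 1, 1, 0, 0)
newest = datetime(2024, 1, 1, 18, 0)
print(oplog_window_hours(oldest, newest), retention_is_too_short(oldest, newest))
```

Passing `threshold_hours=72` checks the same window against the longer retention period recommended earlier on this page.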