Generic MongoDB

After you select Generic MongoDB as the Source for creating the Pipeline, provide the connection settings and data replication details in the Configure your MongoDB Source page.

You can connect to the MongoDB database in one of the following ways:

  • By specifying the individual connection fields such as database host, database port, username, and password.

  • By entering the connection URI to connect to your MongoDB replica set or sharded cluster.

    Connection URIs are of two types:

    • DNS Seedlist: This type of connection URI has the prefix mongodb+srv://.

      For example, mongodb+srv://Jerome:Hevo123@demo.westeros.net/.

      The +srv part indicates that the hostname corresponds to a DNS SRV record. The hostnames and port values for your MongoDB database are fetched from the DNS SRV record.

    • Standard Connection String: This type of connection URI has the prefix mongodb://. This contains a comma-separated list of host:port combinations.

      For example, mongodb://Jerome:Hevo123@demo1.westeros.net:21275,demo2.westeros.net:21275/.

      In the above example, Jerome is the database user, Hevo123 is the password for the database user, demo1.westeros.net and demo2.westeros.net are the IP addresses or hostnames of the database hosts, and 21275 is the database port.
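To make the anatomy of the two URI types concrete, here is a minimal JavaScript sketch that splits a connection URI into its parts. The function name parseMongoUri is hypothetical and is not part of Hevo or of any MongoDB driver. The sketch assumes hostnames or IPv4 addresses (no IPv6 literals), ignores the database path and query options, and falls back to the default port 27017 when a standard connection string omits the port.

```javascript
// Hypothetical helper: splits a MongoDB connection URI into its parts.
// Illustration only; real drivers handle many more cases.
function parseMongoUri(uri) {
  const pattern =
    /^(mongodb(?:\+srv)?):\/\/(?:([^:@\/]+)(?::([^@\/]*))?@)?([^\/?]+)(?:\/[^?]*)?(?:\?.*)?$/;
  const match = pattern.exec(uri);
  if (!match) {
    throw new Error("Not a MongoDB connection URI: " + uri);
  }
  const [, scheme, user, password, hostList] = match;
  const isSrv = scheme === "mongodb+srv";

  const hosts = hostList.split(",").map((entry) => {
    const [host, port] = entry.trim().split(":");
    // With mongodb+srv://, ports come from the DNS SRV record, not the URI.
    return { host, port: isSrv ? null : Number(port || 27017) };
  });

  return {
    type: isSrv ? "DNS Seedlist" : "Standard Connection String",
    // Credentials may be percent-encoded inside a URI.
    user: user === undefined ? null : decodeURIComponent(user),
    password: password === undefined ? null : decodeURIComponent(password),
    hosts,
  };
}

const parsed = parseMongoUri(
  "mongodb://Jerome:Hevo123@demo1.westeros.net:21275,demo2.westeros.net:21275/"
);
console.log(parsed.type, parsed.hosts);
```

Calling the same function with a mongodb+srv:// URI returns a single host with a null port, reflecting that the actual hosts and ports are resolved from the DNS SRV record.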

Prerequisites


Perform the following steps to configure your Generic MongoDB Source:

Set up MongoDB Replication for OpLog and Change Streams

Note: Hevo supports data ingestion from MongoDB via OpLog and Change Streams. Change Streams internally use OpLog for replication.

To set up replication for OpLog or Change Streams, perform the following steps:

  1. Modify the MongoDB server configuration: The MongoDB configuration file, mongod.conf, is generally found in the /etc/ directory on a Linux system. Configure the following options:

    • replication.replSetName: The name of the replica set that this MongoDB instance is part of.

    • replication.oplogSizeMB: The maximum size, in megabytes, of the OpLog that MongoDB retains. Keep this value sufficiently high; we recommend 2048 MB (2 GB).

    • net.bindIp: The IP address on which this MongoDB server listens for connections.

      An example configuration looks as follows:

      net:
        bindIp: 0.0.0.0
      replication:
        replSetName: "repSet0"
        oplogSizeMB: 2048
      
  2. Configure replication through the MongoDB shell: Open the MongoDB shell on the replication server and run the following commands:

    • rs.initiate(): This command will initialize the replica set.

    • rs.conf(): This command will show you the replication configuration that has been set.

    • rs.status(): This command will show you the replication status.
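After running the commands above, it can help to confirm that the replica set has actually elected a primary. The following is a minimal sketch that summarizes a document shaped like the output of rs.status(). The function name summarizeReplicaSet, the sample host names, and the choice to treat "a healthy primary exists" as ready are all assumptions made for illustration; the sample document is trimmed to the fields the sketch reads (set, members, name, health, stateStr).

```javascript
// Hypothetical helper: summarizes a document shaped like rs.status() output.
function summarizeReplicaSet(status) {
  const members = status.members || [];
  // health is 1 for members that are up and reachable.
  const healthy = members.filter((m) => m.health === 1);
  const primary = healthy.find((m) => m.stateStr === "PRIMARY");
  const secondaries = healthy
    .filter((m) => m.stateStr === "SECONDARY")
    .map((m) => m.name);
  return {
    set: status.set,
    primary: primary ? primary.name : null,
    secondaries,
    // Assumption of this sketch: "ready" means a healthy primary exists.
    ready: Boolean(primary),
  };
}

// Sample document with made-up host names, trimmed to the fields used here.
const sampleStatus = {
  set: "repSet0",
  members: [
    { name: "mongo-a.example.com:27017", health: 1, stateStr: "PRIMARY" },
    { name: "mongo-b.example.com:27017", health: 1, stateStr: "SECONDARY" },
  ],
};
console.log(summarizeReplicaSet(sampleStatus));
```

In the MongoDB shell, the same function could be applied to the live result, for example summarizeReplicaSet(rs.status()).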


Set up Permissions to Read Generic MongoDB Databases

After you set up replication for OpLog and Change Streams, you must grant the database user read privileges on the local database and on the databases to be replicated. To do this:

  1. Open the MongoDB shell.

  2. Connect to your replica set or sharded cluster as an admin user.

  3. Depending on the MongoDB version, run the following commands to create a user and assign it permissions.

    • For MongoDB versions between 2.4 and 2.6:

       use admin
       db.addUser({
         user: "<username>",
         pwd: "<password>",
         roles: ["readAnyDatabase"]
       })

      Replace <username> and <password> with a username and password of your choice.
      
    • For MongoDB versions between 3.0 and 3.2:

       use admin
       db.createUser({
         user: "<username>",
         pwd: "<password>",
         roles: ["readAnyDatabase"]
       })

      Replace <username> and <password> with a username and password of your choice.
      
    • For MongoDB versions 3.2 and later:

       use admin
       db.createUser({
         user: "<username>",
         pwd: "<password>",
         roles: ["readAnyDatabase", { role: "read", db: "local" }]
       })

      Replace <username> and <password> with a username and password of your choice.
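Once the user exists, you may want to confirm that it holds the roles granted above. The following minimal sketch checks a user document shaped like the result of db.getUser(), whose roles field is a list of { role, db } pairs. It targets the MongoDB 3.2 and later case; the helper name missingRoles and the sample user name are hypothetical.

```javascript
// Roles granted by the command for MongoDB 3.2 and later, as (role, db) pairs.
// A role given as a plain string while in the admin database resolves to admin.
const REQUIRED_ROLES = [
  { role: "readAnyDatabase", db: "admin" },
  { role: "read", db: "local" },
];

// Hypothetical helper: returns the required roles missing from a user document.
function missingRoles(userDoc) {
  const granted = userDoc.roles || [];
  return REQUIRED_ROLES.filter(
    (needed) =>
      !granted.some((g) => g.role === needed.role && g.db === needed.db)
  );
}

// Sample document with a made-up user name, shaped like db.getUser() output.
const sampleUser = {
  user: "replication_reader",
  db: "admin",
  roles: [{ role: "readAnyDatabase", db: "admin" }],
};
console.log(missingRoles(sampleUser)); // lists the read role on local as missing
```

In the MongoDB shell, after use admin, the helper could be applied as missingRoles(db.getUser("<username>")); an empty list means both roles are present.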
      

Whitelist Hevo’s IP Addresses

To allow Hevo to access your MongoDB databases, you must whitelist Hevo’s IP addresses.

To do this, add Hevo’s IP addresses to the list of authorized IP addresses or CIDR blocks of your MongoDB instance.
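Because the allow list accepts both single IP addresses and CIDR blocks, it can be useful to check whether a given address is already covered by an existing entry. The following is a minimal IPv4-only sketch. The function names are hypothetical, and the addresses come from ranges reserved for documentation; they are not Hevo’s actual IP addresses.

```javascript
// Converts a dotted IPv4 address to an unsigned 32-bit number.
function ipv4ToNumber(ip) {
  const parts = ip.split(".");
  if (parts.length !== 4 || parts.some((p) => !/^\d{1,3}$/.test(p) || Number(p) > 255)) {
    throw new Error("Invalid IPv4 address: " + ip);
  }
  return parts.reduce((acc, part) => acc * 256 + Number(part), 0);
}

// Returns true when `ip` falls inside `entry`, which is either a single
// address ("203.0.113.5") or a CIDR block ("192.0.2.0/24").
function matchesEntry(ip, entry) {
  const [network, prefixText] = entry.split("/");
  const prefix = prefixText === undefined ? 32 : Number(prefixText);
  if (!Number.isInteger(prefix) || prefix < 0 || prefix > 32) {
    throw new Error("Invalid CIDR prefix in: " + entry);
  }
  // Build the network mask; >>> 0 keeps the value unsigned.
  const mask = prefix === 0 ? 0 : (0xffffffff << (32 - prefix)) >>> 0;
  return (
    ((ipv4ToNumber(ip) & mask) >>> 0) === ((ipv4ToNumber(network) & mask) >>> 0)
  );
}

// Returns true when any allow-list entry covers `ip`.
function isAllowed(ip, allowList) {
  return allowList.some((entry) => matchesEntry(ip, entry));
}

// Placeholder allow list built from documentation-only address ranges.
const allowList = ["192.0.2.0/24", "203.0.113.5"];
console.log(isAllowed("192.0.2.10", allowList));
```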


Configure Generic MongoDB Connection Settings

Perform the following steps to configure Generic MongoDB as a Source in Hevo:

Configure Source

  • Pipeline Name: A unique name for the Pipeline, not exceeding 255 characters.

  • General Connection Settings:

    • Paste Connection String:

      • Connection URI: The unique identifier for connecting to a MongoDB replica set or a sharded cluster.
    • Enter Connection Fields Manually:

      • Database Host: IP address or hostname of the database you want to access. To specify multiple database hosts, provide a comma-separated list of all IPs/DNS names.

        For example, for a sharded cluster, you can specify cluster0-shard-00-00.example.com, cluster0-shard-00-01.example.com, cluster0-shard-00-02.example.com. Hevo always connects to a secondary instance.

        Note: For URL-based hostnames, exclude the protocol part, for example, mongodb:// or mongodb+srv://.

      • Database User: The authenticated user that can read the collections in your database. Read Set up Permissions to Read Generic MongoDB Databases.

      • Database Password: The password for the database user.

      • Database Port: The port on which your MongoDB server is listening for connections. Default: 27017.

  • Authentication Database Name: The database that stores the user’s information. The user name and password entered in the preceding steps are validated against this database. Default: admin.

  • Connection Settings:

    • Connect through SSH: Enable this toggle option to connect to Hevo using an SSH tunnel, instead of directly connecting your MongoDB database host to Hevo. This provides an additional level of security to your MongoDB database by not exposing your MongoDB setup to the public. Read Connecting Through SSH.

      If this option is disabled, you must whitelist Hevo’s IP addresses to allow Hevo to connect to your MongoDB host.

    • Use SSL: Enable this toggle option if you have enabled SSL at the MongoDB server end.

  • Advanced Settings:

    • Load All Databases: If enabled, Hevo fetches data from all the databases you have access to on the selected host. If disabled, provide the database name to fetch data from. In the case of OpLog, you can specify a comma-separated list of database names.

    • Merge Collections: If enabled, collections with the same name across different databases are merged into a single Destination table. If disabled, separate tables are created, prefixed with the respective database name. See Example - Merge Collections Feature.

    • Load Historical Data: If disabled, Hevo loads only the data that was written to your database after the Pipeline was created. If enabled, the entire table data is fetched during the first run of the Pipeline.

    • Include New Tables in the Pipeline: Applicable for all Pipeline modes except Custom SQL. If enabled, Hevo automatically ingests data from tables created after the Pipeline has been built. If disabled, the new tables are listed in the Pipeline Detailed View in Skipped state, and you can manually include the ones you want and load their historical data.

    You can change this setting later.
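The Database Host field described above takes a comma-separated list without the protocol part. The following minimal sketch cleans such a value and, to show how the manual fields relate to a Standard Connection String, assembles one from them. Both function names are hypothetical. Mapping the Authentication Database Name to the authSource connection option is an assumption about how the fields correspond, not a description of what Hevo does internally.

```javascript
const DEFAULT_PORT = 27017; // default stated for the Database Port field

// Hypothetical helper: turns a comma-separated Database Host value into a
// clean list, dropping any protocol prefix and trailing slash.
function normalizeHosts(rawHosts) {
  return rawHosts
    .split(",")
    .map((entry) =>
      entry.trim().replace(/^mongodb(\+srv)?:\/\//, "").replace(/\/+$/, "")
    )
    .filter((entry) => entry.length > 0);
}

// Hypothetical helper: shows how the manual fields line up with a Standard
// Connection String. Using authSource for the authentication database is an
// assumption made for this illustration.
function buildStandardUri({ hosts, port = DEFAULT_PORT, user, password, authDb = "admin" }) {
  const hostList = normalizeHosts(hosts)
    .map((host) => `${host}:${port}`)
    .join(",");
  // Percent-encode credentials so characters such as @ or : stay unambiguous.
  const credentials = `${encodeURIComponent(user)}:${encodeURIComponent(password)}`;
  return `mongodb://${credentials}@${hostList}/?authSource=${encodeURIComponent(authDb)}`;
}

console.log(
  buildStandardUri({
    hosts: "cluster0-shard-00-00.example.com, cluster0-shard-00-01.example.com",
    user: "Jerome",
    password: "Hevo123",
  })
);
```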


Limitations

  • Hevo does not support a standalone MongoDB instance that is not configured as a replica set.

See Also


Revision History

Refer to the following table for the list of key updates made to this page:

Date          Release   Description of Change
Jul-12-2021   1.67      Added the field Include New Tables in the Pipeline under Source configuration settings.
Jun-28-2021   1.66      - Updated the page overview section.
                        - Updated the section Set up Permissions to Read Generic MongoDB Databases to include latest commands.
                        - Updated the section Configure Generic MongoDB Connection Settings to include the option to connect to the MongoDB database using connection string.
                        - Added section, Limitations.
Last updated on 20 Jul 2021