Configuring Generic MongoDB as a Source

After you have selected Generic MongoDB as the Source for creating the Pipeline, provide the connection settings and data replication details described here on the Configure Your Source page.

Prerequisites

Configuring Generic MongoDB Settings in Hevo

Provide the following information in the Configure Your Source page during Pipeline creation:

  1. Pipeline Name: A unique name for the Pipeline.

  2. Database Host: The IP address or hostname of the database you want to access. If you are connecting to a replica set, provide a comma-separated list of all IP addresses or DNS names in the replica set. For example, cluster0-shard-00-00.example.com, cluster0-shard-00-01.example.com, cluster0-shard-00-02.example.com. Hevo always connects to a secondary instance.

    If you have a sharded configuration, provide the MongoDB Router address. Read MongoDB for more information on MongoDB configurations.

    Note: If you do not have this information, reach out to your MongoDB Administrator.

  3. Database User: The authenticated user that can read the collections in your database. Read Setting up Permissions to Read Generic MongoDB Databases.

    Note: It is recommended that only Read-Only permissions be provided to the user.

  4. Database Password: The password for the database user.

  5. Database Port: The port on which your MongoDB server is listening for connections (default is 27017).

  6. Auth DB Name: The database that stores the user’s information. The user name and password entered in the preceding steps are validated against this database. For example, admin.

    Note: As a good practice, you can keep the information of all your users, including their access privileges, in the admin database, which is created by default in MongoDB.

  7. Connection Settings:

    • Connect through SSH: Enable this toggle option to connect to Hevo using an SSH tunnel, instead of directly connecting your MongoDB database host to Hevo. This provides an additional level of security to your MongoDB database by not exposing your MongoDB setup to the public. Read Connecting Through SSH.

      If this option is disabled, you must whitelist Hevo’s IP addresses to allow Hevo to connect to your MongoDB host.

    • Use SSL: Enable this toggle option if you have enabled SSL at the MongoDB server end.

  8. Advanced Settings:

    • Load All Databases: If enabled, Hevo fetches data from all your databases on the selected host. If disabled, provide the Database Name to fetch data from.

      Note: You can separate multiple database names with a comma.

    • Merge Collections: If enabled, collections with the same name across different databases are merged into a single Destination table. If disabled, separate tables are created, prefixed with the respective database name. See Example - Merge Collections Feature.

    • Load Historical Data: If disabled, Hevo loads only the data that was written to your database after the Pipeline was created. If enabled, all existing data in your collections is fetched during the first run of the Pipeline.
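
As a rough illustration of how the settings above relate to each other, the sketch below assembles a standard MongoDB connection string from the host, port, user, password, Auth DB, and SSL fields, and derives a Destination table name from the Merge Collections setting. Both function names are hypothetical, the underscore used as the database-name prefix separator is an assumption (this page does not state the separator), and none of this is presented as Hevo's internal implementation.

```javascript
// Hypothetical helper: builds a standard MongoDB connection string
// from the fields entered on the Configure Your Source page.
function buildMongoUri({ hosts, port = 27017, user, password, authDb = "admin", useSsl = false }) {
  // "Database Host" may be a comma-separated list for a replica set.
  const hostList = hosts
    .split(",")
    .map((h) => h.trim())
    .filter((h) => h.length > 0)
    .map((h) => `${h}:${port}`)
    .join(",");
  // Percent-encode credentials so characters such as "@" or ":" are safe.
  const credentials = `${encodeURIComponent(user)}:${encodeURIComponent(password)}`;
  // authSource names the Auth DB; tls mirrors the Use SSL toggle.
  const options = new URLSearchParams({ authSource: authDb, tls: String(useSsl) });
  return `mongodb://${credentials}@${hostList}/?${options.toString()}`;
}

// Hypothetical helper: names the Destination table for a collection.
// Assumption: the database-name prefix is joined with an underscore.
function destinationTable(database, collection, mergeCollections) {
  return mergeCollections ? collection : `${database}_${collection}`;
}

console.log(
  buildMongoUri({
    hosts: "cluster0-shard-00-00.example.com, cluster0-shard-00-01.example.com",
    user: "hevo",
    password: "p@ss",
  })
);
console.log(destinationTable("sales", "orders", false));
```

With Merge Collections enabled, destinationTable("sales", "orders", true) and destinationTable("archive", "orders", true) both resolve to the same table, which is the merging behavior described above.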

Setting up MongoDB Replication for OpLog and Change Streams

Note: Hevo supports data ingestion from MongoDB via OpLog and Change Streams. Change Streams internally use OpLog for replication.

To set up replication for OpLog or Change Streams, perform the following steps:

  1. Modify the MongoDB server configuration: The MongoDB configuration file, mongod.conf, is generally found in the /etc/ directory on a Linux system. Configure the following options:
    • replication.replSetName: The name of the replica set that this MongoDB instance is part of.
    • replication.oplogSizeMB: The maximum size of the OpLog that MongoDB retains. Keep this sufficiently large; we recommend 2048 MB (2 GB).
    • net.bindIp: The IP address that this MongoDB server listens on.
    • An example configuration looks as follows:

      net:
        bindIp: 0.0.0.0
      replication:
        replSetName: "repSet0"
        oplogSizeMB: 2048
      
  2. Configure replication through the MongoDB shell: Open the Mongo shell on the replication server and run the following commands:
    • rs.initiate(): Initializes the replica set.
    • rs.conf(): Shows the replication configuration that has been set.
    • rs.status(): Shows the replication status.
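
Called without arguments, rs.initiate() applies a default configuration. As a minimal sketch, assuming you would rather list the members explicitly, the helper below builds a configuration document of the shape that rs.initiate() accepts as an optional argument. The function name buildReplicaSetConfig and the host names are hypothetical; the replica set name must match replication.replSetName in mongod.conf.

```javascript
// Hypothetical helper: builds a replica set configuration document.
function buildReplicaSetConfig(replSetName, hosts, port = 27017) {
  return {
    _id: replSetName, // must match replication.replSetName in mongod.conf
    members: hosts.map((host, index) => ({
      _id: index, // each member needs a unique integer id
      host: `${host}:${port}`,
    })),
  };
}

const config = buildReplicaSetConfig("repSet0", [
  "mongo-0.example.com",
  "mongo-1.example.com",
  "mongo-2.example.com",
]);
console.log(JSON.stringify(config, null, 2));
// In the Mongo shell, a document of this shape could be passed as rs.initiate(config).
```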

Setting up Permissions to Read Generic MongoDB Databases

After you have set up MongoDB replication for OpLog and Change Streams, you must assign the required permissions to the Hevo user to read from the different databases. To do this, run the following commands in your Mongo shell:

  •   use <your admin db>
    
  •   db.grantRolesToUser('<Hevo user>',[{ role: "read", db: "<db to replicate>" }])
    
  •   db.grantRolesToUser('<Hevo user>',[{role: "read", db: "local"}])
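
When several databases are replicated, the two grant commands above can be combined into a single roles array. The sketch below is one way to build that array; the function name buildReadRoles and the database names are hypothetical. It always appends the local database, which holds the OpLog, and skips duplicates.

```javascript
// Hypothetical helper: builds the roles array for db.grantRolesToUser().
function buildReadRoles(databases) {
  // A Set drops duplicates; "local" is always included for OpLog access.
  const names = new Set([...databases, "local"]);
  return [...names].map((db) => ({ role: "read", db }));
}

const roles = buildReadRoles(["sales", "inventory"]);
console.log(JSON.stringify(roles));
// In the Mongo shell, after switching to your admin database:
//   db.grantRolesToUser("<Hevo user>", roles)
```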
    

Whitelisting Hevo’s IP Address in Generic MongoDB

In order to allow Hevo to access your MongoDB databases, you must whitelist Hevo’s IP addresses.

To do this, add the IP addresses to the list of authenticated IP Addresses/CIDR of your MongoDB instance by following these simple steps.


See Also

Last updated on 09 Oct 2020