Elasticsearch is a distributed, RESTful search and analytics engine that centrally stores your data so you can search, index, and analyze data of all shapes and sizes.
Hevo connects to your Elasticsearch cluster using the Elasticsearch Rest Client and synchronizes the data available in the cluster to your preferred data warehouse using indices.
Elasticsearch version greater than 7.0. View versions.
There is at least one sortable field in each document. To be sortable, the fields can be of any of these types:
The database username and password are available if your Elasticsearch host uses Native Realm authentication.
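The version prerequisite can be checked against the JSON that an Elasticsearch cluster returns from its root endpoint (`GET /`), which reports the version under `version.number`. The following is a minimal sketch, not part of Hevo: the function name `meets_minimum_version` is hypothetical, and it reads "greater than 7.0" as strictly above 7.0.0, which is an assumption you may need to adjust.

```python
import json


def meets_minimum_version(root_response: str, minimum=(7, 0, 0)) -> bool:
    """Return True if the version in a `GET /` response is above `minimum`.

    Assumption: "greater than 7.0" is read as strictly above 7.0.0.
    """
    number = json.loads(root_response)["version"]["number"]  # e.g. "7.10.2"
    # Drop any qualifier such as "-SNAPSHOT" before splitting into parts.
    parts = number.split("-")[0].split(".")
    version = tuple(int(part) for part in parts[:3])
    return version > minimum


# Example with a trimmed-down response body.
sample = '{"name": "node-1", "version": {"number": "7.10.2"}}'
print(meets_minimum_version(sample))
```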
Perform the following steps to configure your Elasticsearch Source:
Retrieve the Hostname
For self-hosted or cloud-based Elasticsearch databases, contact your system administrator to obtain the database hostname and port.
For AWS Elasticsearch services, contact your service provider.
Obtain Username and Password (optional)
The Elastic Stack security features authenticate users by using realms and one or more token-based authentication services. Currently, Hevo’s Elasticsearch integration supports only Native Realm authentication.
Contact your system administrator to obtain the username and password if you do not have these details.
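Before entering the hostname, port, and credentials in Hevo, it can help to confirm that they work against the cluster. Native Realm users authenticate with a username and password, which Elasticsearch accepts as HTTP Basic authentication. The sketch below only builds the request. The helper name `build_cluster_request`, the `http` scheme default, and the example host and credentials are all assumptions for illustration.

```python
import base64
import urllib.request


def build_cluster_request(host, port=9200, user=None, password=None, scheme="http"):
    """Build a request for the cluster root endpoint, with Basic auth if given."""
    url = f"{scheme}://{host}:{port}/"
    request = urllib.request.Request(url)
    if user is not None and password is not None:
        # HTTP Basic auth: base64 of "user:password".
        token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
        request.add_header("Authorization", f"Basic {token}")
    return request


# Hypothetical host and credentials; replace with your own values.
request = build_cluster_request("es.example.com", user="hevo_reader", password="secret")
print(request.full_url)
```

To actually send the request, pass it to `urllib.request.urlopen(request, timeout=10)`. A successful response suggests the host, port, and credentials are usable; an HTTP 401 error suggests the credentials were rejected.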
Configure Elasticsearch Connection Settings
Perform the following steps to configure Elasticsearch as the Source in Hevo:
Click PIPELINES in the Asset Palette.
Click + CREATE in the Pipelines List View.
In the Select Source Type page, select Elasticsearch.
In the Configure your Elasticsearch Source page, specify the following:
Pipeline Name: A unique name for your Pipeline, not exceeding 255 characters.
Database Host: The Elasticsearch database host’s IP address or DNS.
Database Port: The port on which your Elasticsearch server is listening for connections. Default value: 9200.
Database User: The authenticated user that can read the tables in your database.
Database Password: The password for the database user.
Connect through SSH: Enable this option to connect to Hevo using an SSH tunnel, instead of directly connecting your Elasticsearch database host to Hevo. This provides an additional level of security to your database by not exposing your database setup to the public. Read Connecting Through SSH.
If this option is disabled, you must whitelist Hevo’s IP addresses to allow Hevo to connect to your Elasticsearch host.
Load Historical Data: If this option is enabled, the entire table data is fetched during the first run of the Pipeline. If disabled, Hevo loads only the data that was written in your database after the time of creation of the Pipeline.
Click TEST & CONTINUE.
Proceed to configuring the data ingestion and setting up the Destination.
Historical Load: When you create the Pipeline, Hevo fetches all the data available in the Source database. However, you can limit the number of Events ingested in each run of the Pipeline to maintain the processing load on your cluster.
Incremental Data: New and changed data is fetched every 15 minutes by default. You can configure this frequency using the Change Schedule option in the Pipeline Summary Bar.
Elasticsearch does not have the capability to expose each document modification. Therefore, at least one incrementing column of a sortable type is mandatory. The identity column is used as the tiebreaker if the sortable field has the same value for more than one document.
The _id field, which is created by default, is used as the identity column if none is specified.
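The role of the sortable field and the tiebreaker can be illustrated with the Elasticsearch search API, where a `sort` clause combined with `search_after` resumes a scan after the last document already read. This is a sketch of the general technique, not a description of how Hevo queries your cluster. The function name `build_incremental_query`, the field name `updated_at`, and the page size are hypothetical.

```python
def build_incremental_query(sort_field, last_sort_values=None,
                            tiebreaker="_id", page_size=1000):
    """Build a search body that pages through documents in a stable order.

    Documents are ordered by `sort_field`; `tiebreaker` orders documents
    that share the same `sort_field` value.
    """
    body = {
        "size": page_size,
        "sort": [{sort_field: "asc"}, {tiebreaker: "asc"}],
    }
    if last_sort_values is not None:
        # Resume strictly after the last document seen in the previous run.
        body["search_after"] = list(last_sort_values)
    return body


# First run: no position yet, so the scan starts from the beginning.
first = build_incremental_query("updated_at")

# Later run: continue after the last document, identified by its sort values.
later = build_incremental_query("updated_at", last_sort_values=(1700000000000, "doc-42"))
print(later["search_after"])
```

Without the tiebreaker, two documents with the same `updated_at` value would have no defined order, so a scan resuming from that value could skip or repeat one of them.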
- Only Native Realm authentication is supported.