Elasticsearch

Last updated on Nov 13, 2024

Elasticsearch is a distributed, RESTful search and analytics engine that centrally stores your data so that you can index, search, and analyze data of all shapes and sizes. Elasticsearch relies on indices to search and fetch documents from your data. It preempts operations that may cause memory issues and stops them by raising exceptions. Hevo parses some of these exceptions and recommends corrective actions. Read Configuration Changes in Elasticsearch for details.

Hevo connects to your Elasticsearch cluster using the Elasticsearch Transport Client and synchronizes the data available in the cluster to your preferred data warehouse using indices. Currently, Hevo supports the following variants:

  • Generic Elasticsearch
  • AWS Elasticsearch

Source Considerations

  • Elasticsearch does not expose individual document modifications. Therefore, at least one incrementing field of a sortable type is needed to order the documents, and the identity column is used as the tiebreaker when that sortable field has the same value in more than one document.

    If no identity column is specified, the _id field that Elasticsearch creates by default is used.

    Note: The _id field is used for sorting only in Elasticsearch versions 7.6 and below. For versions above 7.6, you should refer to Elasticsearch’s documentation for the appropriate changes.
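    The ordering described above can be illustrated with a minimal sketch. It is not Hevo's implementation. It only builds request bodies for Elasticsearch's search API, using the `sort` and `search_after` options. The field name `updated_at` and the helper names `build_page_query` and `next_cursor` are hypothetical, and sorting on `_id` is assumed to be permitted on your cluster (see the note above for versions later than 7.6).

```python
from typing import Any, Dict, List, Optional


def build_page_query(sort_field: str,
                     last_sort_values: Optional[List[Any]] = None,
                     page_size: int = 1000,
                     tiebreaker: str = "_id") -> Dict[str, Any]:
    """Build a search body that pages through documents in a stable order.

    Documents are ordered by the incrementing field first; the tiebreaker
    decides the order of documents that share the same sort_field value.
    """
    body: Dict[str, Any] = {
        "size": page_size,
        "sort": [
            {sort_field: "asc"},   # incrementing, sortable field (assumed name)
            {tiebreaker: "asc"},   # identity column used as the tiebreaker
        ],
    }
    if last_sort_values is not None:
        # Resume after the last document of the previous page.
        body["search_after"] = list(last_sort_values)
    return body


def next_cursor(hits: List[Dict[str, Any]]) -> Optional[List[Any]]:
    """Return the sort values of the last hit, or None for an empty page."""
    return hits[-1]["sort"] if hits else None
```

    In this sketch, the first page is requested with `build_page_query("updated_at")`. Each later page passes the value returned by `next_cursor` for the previous page's hits, so two documents with the same `updated_at` value are neither skipped nor read twice.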


Limitations

  • Only Native Realm authentication is supported.

  • Hevo currently does not support deletes. Therefore, any data deleted in the Source may continue to exist in the Destination.

  • Hevo does not support the replication of hidden objects.




Revision History

Refer to the following table for the list of key updates made to this page:

Date | Release | Description of Change
Nov-18-2024 | NA | Renamed section Set up the EC2 instance to Set up the EC2 instance and Whitelist Hevo’s IP addresses and updated it as per the latest Elasticsearch UI.
Mar-05-2024 | 2.21 | Updated the ingestion frequency table in the Data Replication section.
Jan-16-2024 | NA | Updated section, Source Considerations to add information about _id field being used for sorting only in specific Elasticsearch versions.
Jul-21-2023 | NA | Updated section, Limitations to add information about Hevo not supporting replication for hidden objects.
Nov-22-2022 | NA | Updated section, Limitations to add information about Hevo not capturing deletes.
Aug-24-2022 | NA | Updated sections, Data Replication and Configure Elasticsearch Connection Settings to restructure the content for better understanding and coherence.
Jun-09-2022 | NA | Added a reference to the Configuration Changes in Elasticsearch page in the Overview section.
Apr-11-2022 | 1.86 | Added a note in the Connection Settings about setting up a reverse proxy server for connecting to an AWS Elasticsearch Source.
Feb-21-2022 | 1.82 | Added section, (Optional) Connect to Elasticsearch hosted inside a Virtual Private Cloud (VPC).
Jan-03-2022 | 1.79 | Updated the description of the Include New Tables in the Pipeline advanced setting in the Configure Elasticsearch Connection Settings section.
Jul-26-2021 | 1.68 | Added a note for the Database Host field.
Jul-12-2021 | 1.67 | Added the field Include New Tables in the Pipeline under Source configuration settings.
Jun-01-2021 | 1.64 | Updated the Configure Elasticsearch Connection Settings section to include the Connect Through HTTPS setting.
