Transformations

The data received from Sources might not be suitable to be stored in your destination as is. Hevo allows you to transform your data before it is replicated in your destination through Transformations.

These are the common use cases for transforming your data:

  • Cleansing - Your data may contain unclean values or values in inconsistent formats. You may want to convert all booleans to 0s and 1s, or trim special characters or spaces from values. All these operations can be done in Transformations.
  • Data Enrichment - Data can be enriched from static sources. For example, IPs can be converted to cities or zip codes, and coordinates can be converted to geohashes. You may also enrich your data from your business-specific static data sets.
  • Re-expression - You can re-express your data in different units or formats, for example, by converting all currency values to USD or changing all dates to a particular time zone.
  • Filtering - You can filter events that you don’t want to replicate to your warehouse based on complex logic.
  • Normalization - Do you have a nested event structure coming from a NoSQL Source like MongoDB? You can create normalized schemas from a single event through Transformations.
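
As a rough illustration of the Cleansing case above, the sketch below converts booleans to 0s and 1s and trims surrounding whitespace from string values. SampleEvent is a hypothetical stand-in used only so that the snippet runs outside Hevo. It mimics just the getEventName() and getProperties() methods used elsewhere on this page, and the field names are invented.

```python
class SampleEvent:
    """Hypothetical stand-in for the Hevo event object (local runs only)."""

    def __init__(self, name, properties):
        self._name = name
        self._properties = properties

    def getEventName(self):
        return self._name

    def getProperties(self):
        return self._properties


def transform(event):
    properties = event.getProperties()
    # Iterate over a copy of the items so values can be reassigned safely
    for key, value in list(properties.items()):
        if isinstance(value, bool):
            # Booleans become 0 or 1
            properties[key] = int(value)
        elif isinstance(value, str):
            # Trim leading and trailing whitespace
            properties[key] = value.strip()
    return event


# Invented sample data, not a real Hevo event
sample = SampleEvent('users', {'is_active': True, 'city': '  Pune '})
cleaned = transform(sample).getProperties()
```

The boolean check comes before the string check on purpose: it keeps the two cases separate and leaves other value types (numbers, None, nested objects) untouched.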

How does it work?

Hevo runs your Transformations code on each event received through your pipelines. To transform an event, change the properties of the event object that the transform method receives as its parameter.

def transform(event):
    return event

The event object has various convenience methods to help you write your transformations. Please refer to API Reference - Event Object in Transformations for more details. The properties of the event object can be modified to your liking: fields can be added, modified, or removed.
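
A minimal sketch of adding, modifying, and removing fields is shown here. It assumes that getProperties() returns a dictionary that can be changed in place, as the longer example on this page suggests. SampleEvent is a hypothetical stand-in so that the snippet runs locally, and the field names (source, email, password) are invented.

```python
class SampleEvent:
    """Hypothetical stand-in for the Hevo event object (local runs only)."""

    def __init__(self, name, properties):
        self._name = name
        self._properties = properties

    def getEventName(self):
        return self._name

    def getProperties(self):
        return self._properties


def transform(event):
    properties = event.getProperties()

    # Add a new field
    properties['source'] = 'web'

    # Modify an existing field, only if it is present
    if 'email' in properties:
        properties['email'] = properties['email'].lower()

    # Remove a field; pop() with a default does not fail if it is missing
    properties.pop('password', None)

    return event


# Invented sample data, not a real Hevo event
sample = SampleEvent('users', {'email': 'Ada@Example.com', 'password': 'secret'})
result = transform(sample).getProperties()
```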

The environment has all the standard Python libraries enabled for you to write your transformations.

Below is an example of a more complex transformation:

from ua_parser import user_agent_parser
import datetime

def transform(event):
  event_name = event.getEventName()
  properties = event.getProperties()

  # Concatenate First Name and Last Name of the User to create Full Name
  if event_name == 'users':
    properties['full_name'] = properties['first_name'] + ' ' +  properties['last_name']

  # Parse the User Agent string
  if properties.get('user_agent'):
    user_agent = properties.get('user_agent')
    parsed_string = user_agent_parser.ParseUserAgent(user_agent)
    properties['user_agent'] = parsed_string.get('family')

  # Add a timestamp at which the event was streamed
  properties['streamed_ts'] = datetime.datetime.now()
  return event

Testing your code

Once you write your Transformations code or make changes to the existing code, you can test it against real sample events received by Hevo from your Sources. You can also test your code against events from a specific pipeline or a specific event type. You can even modify the sample events to test your code against edge cases.
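
One way to think about edge cases is to run the same logic over a normal sample and over modified copies of it. The sketch below does this locally for the full-name logic from the earlier example. build_full_name is a hypothetical helper, and the sample dictionaries are invented rather than real Hevo events. It is meant only to show the kind of inputs worth trying, such as a missing or empty field.

```python
def build_full_name(properties):
    """Hypothetical helper: join first and last name, tolerating gaps."""
    # .get() avoids a KeyError when a field is missing;
    # "or ''" also covers fields that are present but set to None
    first = properties.get('first_name') or ''
    last = properties.get('last_name') or ''
    return (first + ' ' + last).strip()


# A normal sample followed by modified copies that exercise edge cases
samples = [
    {'first_name': 'Ada', 'last_name': 'Lovelace'},  # typical event
    {'first_name': 'Ada'},                           # last_name missing
    {'first_name': None, 'last_name': ''},           # empty values
]

results = [build_full_name(sample) for sample in samples]
```

Concatenating with properties['first_name'] + ' ' + properties['last_name'] directly would raise an error on the second and third samples, which is the sort of failure that testing with modified sample events is meant to surface before you deploy.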

Deploying your code

Once you have tested your code and are satisfied with it, press Deploy. The deployed code applies to all incoming events. We advise you to monitor the Failed Events for any failures caused by the newly deployed code.

Last updated on 26 Aug 2020