Azure Event Hubs

The Azure Event Hubs destination writes data to a single event hub in Microsoft Azure Event Hubs.

Each record is written to the event hub as an event. Events are written to available partitions using a round-robin distribution pattern.
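
As an illustration of the round-robin pattern only, the following minimal sketch cycles records through a list of partitions in order. The function name `assign_round_robin` and the sample partition IDs are hypothetical; the actual partition assignment is handled when events are written and is not configured this way.

```python
from itertools import cycle


def assign_round_robin(records, partition_ids):
    """Pair each record with a partition, cycling through the partitions in order."""
    partitions = cycle(partition_ids)
    return [(next(partitions), record) for record in records]


# Five sample records spread over three hypothetical partitions.
for partition, record in assign_round_robin(["r1", "r2", "r3", "r4", "r5"], ["0", "1", "2"]):
    print(partition, record)
```

With three partitions, the fourth record wraps around to the first partition again.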

When you configure the destination, you specify the event hub to use and connection information for the event hub. You also specify the data format to use.

Before you use the Azure Event Hubs destination, complete the prerequisite tasks.

Prerequisites

Complete the following prerequisites, as needed, before you configure the Azure Event Hubs destination.

  1. Authorize access to the event hub using shared access signatures.

    The Azure Event Hubs destination requires read and write access to the event hub. For information about assigning access to Azure Event Hubs resources, see the Azure documentation.

    The destination does not support access through Active Directory at this time.

  2. Retrieve the Azure Event Hubs connection string.

    When you configure the Azure Event Hubs destination, you must provide the namespace, shared access policy, and shared access key. These details are included in the Azure Event Hubs connection string, as follows:

    Endpoint=sb://<namespace>.servicebus.windows.net/;SharedAccessKeyName=<shared access policy>;SharedAccessKey=<shared access key>

    For information about retrieving the connection string, see the Azure documentation.
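
To show how the three values map onto the connection string format above, here is a minimal sketch that splits a connection string into its parts. The helper name `parse_connection_string` and the sample values are hypothetical, and the sketch assumes the string follows exactly the layout shown in the prerequisites.

```python
def parse_connection_string(conn_str):
    """Extract the namespace, shared access policy, and shared access key."""
    parts = {}
    for segment in conn_str.strip().split(";"):
        if not segment:
            continue
        # Split on the first "=" only, because a key value can itself end in "=".
        key, _, value = segment.partition("=")
        parts[key] = value
    # The endpoint has the form sb://<namespace>.servicebus.windows.net/
    host = parts["Endpoint"].split("://", 1)[1].rstrip("/")
    return {
        "namespace": host.split(".", 1)[0],
        "shared_access_policy": parts["SharedAccessKeyName"],
        "shared_access_key": parts["SharedAccessKey"],
    }


# Placeholder values for illustration only; not a real key.
sample = (
    "Endpoint=sb://demo-ns.servicebus.windows.net/;"
    "SharedAccessKeyName=RootManageSharedAccessKey;"
    "SharedAccessKey=abc123="
)
print(parse_connection_string(sample))
```

The returned values correspond to the Namespace Name, Shared Access Policy Name, and Shared Access Key properties of the destination.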

Data Formats

The Azure Event Hubs destination writes records based on the specified data format.

The destination can write using the following data formats:
Avro
The destination writes records based on the Avro schema.
Note: To use the Avro data format, Apache Spark version 2.4 or later must be installed on the Transformer machine and on each node in the cluster.
You can use one of the following methods to specify the location of the Avro schema definition:
  • In Pipeline Configuration - Use the schema defined in the stage properties. Optionally, you can configure the destination to register the specified schema with Confluent Schema Registry at a URL with a schema subject.
  • Confluent Schema Registry - Retrieve the schema from Confluent Schema Registry. Confluent Schema Registry is a distributed storage layer for Avro schemas. You specify the URL to Confluent Schema Registry and whether to look up the schema by the schema ID or subject.

You can also compress data with an Avro-supported compression codec.

Delimited
The destination writes a delimited message for every record. You can specify a custom delimiter, quote, and escape character to use in the data.

JSON
The destination writes a JSON Lines message for every record. For more information, see the JSON Lines website.

Text
The destination writes a message with a single String field for every record. When you configure the destination, you select the field to use.
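
To make the difference between the delimited, JSON, and text formats concrete, the following sketch serializes one sample record in each of the three ways using only the Python standard library. The record, the field names, and the helper functions are hypothetical, and the exact bytes that the destination writes may differ; this only approximates the shape of each message.

```python
import csv
import io
import json

# Hypothetical sample record.
record = {"id": 7, "name": "Ada", "city": "London, UK"}


def to_delimited(rec, delimiter=",", quotechar='"', escapechar="\\"):
    """Write the record's values as one delimited line."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter=delimiter, quotechar=quotechar,
                        escapechar=escapechar, quoting=csv.QUOTE_MINIMAL)
    writer.writerow(rec.values())
    return buf.getvalue().rstrip("\r\n")  # drop the trailing line terminator


def to_json_line(rec):
    """Write the record as a single-line JSON object, as in JSON Lines."""
    return json.dumps(rec)


def to_text(rec, field):
    """Write only the selected String field of the record."""
    return str(rec[field])


print(to_delimited(record))     # 7,Ada,"London, UK"
print(to_json_line(record))
print(to_text(record, "city"))  # London, UK
```

In the delimited output, the city value is quoted because it contains the delimiter character.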

Configuring an Azure Event Hubs Destination

Configure an Azure Event Hubs destination to write messages to a Microsoft Azure event hub. Before you configure the destination, complete the prerequisite tasks.
  1. On the Properties panel, on the General tab, configure the following properties:
    General Property - Description
    Name - Stage name.
    Description - Optional description.
  2. On the Event Hub tab, configure the following properties:
    Event Hub Property - Description
    Namespace Name - Name of the namespace that contains the event hub.

    For help with retrieving this information, see Prerequisites.

    Event Hub Name - Event hub to write to.
    Shared Access Policy Name - Name of the shared access policy associated with the namespace.

    When appropriate, you can use the default shared access key policy, RootManageSharedAccessKey.

    For help with retrieving this information, see Prerequisites.

    Shared Access Key - One of the shared access keys associated with the specified shared access policy.

    For help with retrieving this information, see Prerequisites.

  3. On the Data Format tab, configure the following properties:
    Data Format Property - Description
    Data Format - Format of the data to write to messages. Select one of the following formats:
    • Avro
    • Delimited
    • JSON
    • Text
  4. For Avro data, click the Schema tab and configure the following properties:
    Schema Property - Description
    Avro Schema Location - Location of the Avro schema definition to use to process data:
    • In Pipeline Configuration - Use the schema specified in the Avro Schema property.
    • Confluent Schema Registry - Retrieve the schema from Confluent Schema Registry.
    Avro Schema - Avro schema definition used to write the data.

    You can optionally use the runtime:loadResource function to use a schema definition stored in a runtime resource file.

    Available when Avro Schema Location is set to In Pipeline Configuration.

    Register Schema - Registers the specified Avro schema with Confluent Schema Registry.

    Available when Avro Schema Location is set to In Pipeline Configuration.

    Schema Registry URLs - Confluent Schema Registry URLs used to look up the schema. To add a URL, click Add. Use the following format to enter the URL:
    http://<host name>:<port number>

    Available when Avro Schema Location is set to In Pipeline Configuration.

    Basic Auth User Info - Confluent Schema Registry basic.auth.user.info credential.

    Available when Avro Schema Location is set to Confluent Schema Registry.

    Lookup Schema By - Method used to look up the schema in Confluent Schema Registry:
    • Subject - Look up the specified Avro schema subject.
    • Schema ID - Look up the specified Avro schema ID.

    Available when Avro Schema Location is set to Confluent Schema Registry.

    Schema Subject - Avro schema subject to look up or to register in Confluent Schema Registry.

    If the specified subject to look up has multiple schema versions, the destination uses the latest schema version for that subject. To use an older version, find the corresponding schema ID, and then set the Lookup Schema By property to Schema ID.

    Available when Avro Schema Location is set to Confluent Schema Registry.

    Schema ID - Avro schema ID to look up in Confluent Schema Registry.

    Available when Avro Schema Location is set to Confluent Schema Registry.

    Avro Compression Codec - Avro compression type to use.
  5. For delimited data, on the Data Format tab, configure the following properties:
    Delimited Property - Description
    Delimiter Character - Delimiter character to use in the data. Select one of the available options or select Other to enter a custom character.

    You can enter a Unicode control character using the format \uNNNN, where N is a hexadecimal digit from the numbers 0-9 or the letters A-F. For example, enter \u0000 to use the null character as the delimiter or \u2028 to use a line separator as the delimiter.

    Quote Character - Quote character to use in the data.
    Escape Character - Escape character to use in the data.
  6. For text data, on the Data Format tab, configure the following property:
    Text Property - Description
    Text Field - String field in the record that contains the data to be written. All data must be incorporated into the specified field.
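
The \uNNNN notation described for the Delimiter Character property can be checked with a short sketch such as the following, which converts an escape into the character it names. The helper name `decode_delimiter` is hypothetical, and accepting lowercase hexadecimal letters is an assumption of this sketch rather than documented behavior of the destination.

```python
import re

# Matches a backslash, the letter u, and exactly four hexadecimal digits.
_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})")


def decode_delimiter(value):
    """Return the character named by a \\uNNNN escape; pass other input through."""
    match = _ESCAPE.fullmatch(value)
    if match is None:
        return value
    return chr(int(match.group(1), 16))


print(repr(decode_delimiter("\\u0000")))  # the null character
print(repr(decode_delimiter("\\u2028")))  # the line separator
print(repr(decode_delimiter("|")))        # ordinary characters are unchanged
```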