The Azure IoT/Event Hub Consumer origin reads data from Microsoft Azure Event Hub. The origin can use multiple threads to enable parallel processing of data from a single Azure event hub.
Before you use the Azure IoT/Event Hub Consumer origin, make sure you have the required Microsoft Azure storage account and container.
When you configure the Azure IoT/Event Hub Consumer, you specify the Microsoft Azure namespace and event hub names. You also define the shared access policy name and connection string key. You specify the consumer group to use and an event processor prefix that the origin uses when communicating with Azure Event Hub.
You also configure the storage account details, such as the storage account name and key, and you specify the number of threads to use during processing.
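As a rough illustration, the properties described above can be collected in a single structure. This is only a sketch: the class name, field names, and sample values below are hypothetical stand-ins for the properties named in the text, not the actual Data Collector configuration keys.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class EventHubConsumerConfig:
    """Hypothetical container for the origin properties named in the text."""
    namespace_name: str
    event_hub_name: str
    shared_access_policy_name: str
    connection_string_key: str
    consumer_group: str
    event_processor_prefix: str
    storage_account_name: str
    storage_account_key: str
    container_name: str
    max_threads: int

    def __post_init__(self):
        # Every text property must be filled in.
        for f in fields(self):
            value = getattr(self, f.name)
            if isinstance(value, str) and not value.strip():
                raise ValueError(f"{f.name} must not be empty")
        # The origin needs at least one thread to read from the event hub.
        if self.max_threads < 1:
            raise ValueError("max_threads must be at least 1")


# Placeholder values for illustration only.
config = EventHubConsumerConfig(
    namespace_name="my-namespace",
    event_hub_name="my-event-hub",
    shared_access_policy_name="my-policy",
    connection_string_key="<connection-string-key>",
    consumer_group="my-consumer-group",
    event_processor_prefix="my-prefix",
    storage_account_name="mystorageaccount",
    storage_account_key="<storage-account-key>",
    container_name="my-pipeline-offsets",
    max_threads=5,
)
```

Validating the values up front, as in this sketch, surfaces a missing property before the pipeline starts rather than at connection time.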
Before you use the Azure IoT/Event Hub Consumer origin, you need a Microsoft Azure storage account and at least one container.
The origin stores offsets in a storage account container, so to ensure the integrity of offset information, you must use a different container for each pipeline that includes an Azure IoT/Event Hub Consumer origin.
For example, say you use the Azure IoT/Event Hub Consumer as the origin for an IoT pipeline and a Transactions pipeline. To keep the offset data for these pipelines separate, you need to use two different storage account containers. They can be in the same storage account or in different storage accounts. When you configure the origins, you specify the storage account and container to use.
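The one-container-per-pipeline rule can be checked with a few lines of code before pipelines are deployed. The following is a minimal sketch, assuming you keep your own mapping of pipeline names to storage account and container names. The function name and the sample names are hypothetical and are not part of Data Collector or Azure.

```python
from collections import defaultdict


def find_shared_containers(pipelines):
    """Return {(account, container): [pipeline names]} for every container
    that is assigned to more than one pipeline."""
    users = defaultdict(list)
    for pipeline_name, (account, container) in pipelines.items():
        users[(account, container)].append(pipeline_name)
    return {key: names for key, names in users.items() if len(names) > 1}


# Hypothetical assignments: both pipelines share a storage account
# but use different containers, so no conflict is reported.
pipelines = {
    "IoT": ("storageacct1", "iot-offsets"),
    "Transactions": ("storageacct1", "transactions-offsets"),
}
conflicts = find_shared_containers(pipelines)
```

An empty result means that each pipeline keeps its offsets in its own container. Any entry in the result names the pipelines that would overwrite each other's offset data.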
If you need to create a storage account, click the Add icon. Enter a name for the storage account, and enter or select a resource group name. You can use the defaults for all other properties.
If these steps are no longer accurate, see the Microsoft Azure Event Hub documentation.
You cannot use Data Collector to reset the origin for Azure IoT/Event Hub Consumer pipelines because the origin stores the offset in Microsoft Azure, in the storage account container, rather than in Data Collector.
Removing the container can take some time. Allow the portal to complete the removal of the container before continuing.
The Azure IoT/Event Hub Consumer origin performs parallel processing and enables the creation of a multithreaded pipeline.
The Azure IoT/Event Hub Consumer origin uses multiple concurrent threads to read from an event hub based on the Max Threads property. When you start the pipeline, the origin creates the number of threads specified in the Max Threads property. Each thread connects to the origin system, creates a batch of data, and passes the batch to an available pipeline runner.
A pipeline runner is a sourceless pipeline instance - an instance of the pipeline that includes all of the processors and destinations in the pipeline and represents all pipeline processing after the origin. Each pipeline runner processes one batch at a time, just like a pipeline that runs on a single thread. When the flow of data slows, the pipeline runners wait idly until they are needed.
Multithreaded pipelines preserve the order of records within each batch, just like a single-threaded pipeline. However, because batches are processed by different pipeline runners, the order in which batches are written to destinations is not guaranteed.
For example, say you set the Max Threads property to 5. When you start the pipeline, the origin creates five threads, and Data Collector creates a matching number of pipeline runners. Upon receiving data, the origin passes a batch to each of the pipeline runners for processing.
Each pipeline runner performs the processing associated with the rest of the pipeline. After a batch is written to pipeline destinations, the pipeline runner becomes available for another batch of data. Each batch is processed and written as quickly as possible, independently of the batches processed by other pipeline runners, so batches may be written in a different order from the order in which they were read.
At any given moment, the five pipeline runners can each process a batch, so this multithreaded pipeline processes up to five batches at a time. When incoming data slows, the pipeline runners sit idle, available for use as soon as the data flow increases.
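The threading model in this example can be approximated with a small simulation. This is a simplified sketch and not how Data Collector is implemented: the names `MAX_THREADS`, `BATCHES_PER_THREAD`, `origin_thread`, and `written` are made up for illustration, and the "processing" step is a placeholder that only copies the batch.

```python
import queue
import threading

MAX_THREADS = 5          # stands in for the Max Threads property
BATCHES_PER_THREAD = 3   # arbitrary amount of simulated input per thread

# Pool of available pipeline runners, one per origin thread.
runners = queue.Queue()
for runner_id in range(MAX_THREADS):
    runners.put(runner_id)

written = []                     # (runner_id, batch) in the order batches are written
written_lock = threading.Lock()  # protects the shared "destination"


def origin_thread(thread_id):
    for batch_number in range(BATCHES_PER_THREAD):
        # Create a batch; each record carries its position within the batch.
        batch = [(thread_id, batch_number, position) for position in range(4)]
        runner_id = runners.get()       # wait for an available pipeline runner
        try:
            processed = list(batch)     # placeholder for processors; keeps record order
            with written_lock:
                written.append((runner_id, processed))
        finally:
            runners.put(runner_id)      # the runner becomes available again


threads = [threading.Thread(target=origin_thread, args=(i,)) for i in range(MAX_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

After the run, `written` holds all 15 batches. The records inside each batch keep their original order, but the order of the batches themselves depends on thread scheduling and can differ from run to run, which mirrors the ordering behavior described above.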
For more information about multithreaded pipelines, see Multithreaded Pipeline Overview.