SDC Edge Communication

StreamSets Control Hub works with Data Collector Edge (SDC Edge) to execute edge pipelines. SDC Edge is a lightweight agent that runs pipelines on edge devices with limited resources.

You install each SDC Edge on an edge device in your corporate network, and then register it to work with Control Hub.

You use an authoring Data Collector to design edge pipelines. You can design edge pipelines in the Control Hub Pipeline Designer after selecting an available authoring Data Collector to use. Alternatively, you can log in to an authoring Data Collector directly and design edge pipelines in the Data Collector UI.

To preview and validate edge pipelines as you design them, the authoring Data Collector must connect to a registered SDC Edge. The SDC Edge accepts inbound connections from the authoring Data Collector over HTTP or HTTPS on the port number configured for the SDC Edge.

Registered SDC Edges use encrypted REST APIs to communicate with Control Hub. Each SDC Edge initiates outbound connections to Control Hub over HTTPS on port 443.
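Because an SDC Edge initiates the connection to Control Hub, the edge device must be able to open outbound connections on port 443. The following minimal sketch checks that from a device. The function name can_reach and the host name controlhub.example.com are hypothetical and are not part of any StreamSets tooling. The check only confirms that a TCP connection can be opened; it does not validate the HTTPS certificate or the Control Hub REST APIs.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for illustration only. It tests outbound
    reachability, not TLS or Control Hub authentication.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


if __name__ == "__main__":
    # Replace the placeholder host with your Control Hub host name.
    host = "controlhub.example.com"
    print(f"Outbound 443 to {host} reachable: {can_reach(host)}")
```

The same function can be run on the authoring Data Collector machine against the SDC Edge host and its configured port to check the inbound path used for preview and validation.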

The following image shows how each SDC Edge communicates with Control Hub and with the authoring Data Collector:

SDC Edge Requests

Just like Data Collector, a registered SDC Edge sends requests and information to Control Hub.

Control Hub does not directly send requests to an SDC Edge. Instead, Control Hub sends requests using encrypted REST APIs to a messaging queue managed by Control Hub. An SDC Edge periodically checks with the queue to retrieve Control Hub requests.
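The pull model described above can be illustrated with a small in-memory sketch. This is not the Control Hub implementation: the class MessagingQueue, its method names, and the request fields are assumptions made for illustration. It shows only the direction of communication: Control Hub places a request in the queue for a specific SDC Edge, and the request waits there until that SDC Edge retrieves it.

```python
from collections import defaultdict, deque


class MessagingQueue:
    """In-memory stand-in for the messaging queue (hypothetical)."""

    def __init__(self):
        # Maps an SDC Edge ID to the requests waiting for that edge.
        self._pending = defaultdict(deque)

    def send(self, edge_id, request):
        """Called by Control Hub: queue a request for one SDC Edge."""
        self._pending[edge_id].append(request)

    def retrieve(self, edge_id):
        """Called by an SDC Edge: take all requests waiting for it."""
        requests = list(self._pending[edge_id])
        self._pending[edge_id].clear()
        return requests


if __name__ == "__main__":
    queue = MessagingQueue()
    # Control Hub never contacts the edge; it only writes to the queue.
    queue.send("edge-1", {"action": "start", "pipeline": "p1"})
    # Later, the edge checks the queue and picks up its own requests.
    print(queue.retrieve("edge-1"))
    # A second check finds nothing new.
    print(queue.retrieve("edge-1"))
```

Because every exchange starts from the SDC Edge, the edge device does not need to accept inbound connections from Control Hub.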

SDC Edge communicates with Control Hub in the following areas:

Jobs
Every minute, an SDC Edge sends a heartbeat, the last-saved offsets, and the status of all remotely running pipelines to Control Hub so that Control Hub can manage job execution.
Note: SDC Edge versions 3.13.0 and earlier send this information to the messaging queue instead.
Metrics
Every minute, an SDC Edge sends metrics for remotely running edge pipelines directly to Control Hub.
Messaging queue
At startup, an SDC Edge sends the following information to the messaging queue: SDC Edge version, HTTP URL of the SDC Edge, and labels configured in the SDC Edge configuration file, edge.conf.
Every five seconds, each SDC Edge checks with the messaging queue to retrieve requests sent by Control Hub. When you start, stop, or delete a job, Control Hub sends a pipeline request for a specific SDC Edge to the messaging queue. The messaging queue retains the request until the receiving SDC Edge retrieves the request.
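The intervals listed above can be summarized in a short sketch. The function due_tasks and the task names are hypothetical, and the sketch assumes that all timers are aligned to the moment the SDC Edge starts, which the documentation does not state. It is meant only to show how often each kind of communication occurs relative to the others.

```python
from typing import List

# Intervals taken from the descriptions above, in seconds.
JOB_STATUS_INTERVAL_S = 60   # heartbeat, offsets, and pipeline status
METRICS_INTERVAL_S = 60      # metrics for remotely running pipelines
QUEUE_POLL_INTERVAL_S = 5    # check the messaging queue for requests


def due_tasks(elapsed_s: int) -> List[str]:
    """Return the tasks due at a whole number of seconds since startup.

    Assumption: every timer starts counting at startup.
    """
    if elapsed_s == 0:
        # At startup, the edge sends its version, HTTP URL, and labels.
        return ["send_startup_info"]
    tasks = []
    if elapsed_s % QUEUE_POLL_INTERVAL_S == 0:
        tasks.append("poll_queue")
    if elapsed_s % JOB_STATUS_INTERVAL_S == 0:
        tasks.append("send_job_status")
    if elapsed_s % METRICS_INTERVAL_S == 0:
        tasks.append("send_metrics")
    return tasks


if __name__ == "__main__":
    for second in (0, 5, 7, 60):
        print(second, due_tasks(second))
```

Under these assumptions, an SDC Edge checks the queue twelve times for each job status and metrics update that it sends.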