Firewall Configuration for IBM StreamSets as Client-Managed Software

Applies to: IBM StreamSets as client-managed software

If you access IBM StreamSets from machines that reside behind a firewall or in a system that limits access to specific DNS names and IP addresses, you must allow the required inbound and outbound traffic to each machine.

The requirements differ depending on whether a machine is used to launch a web browser to access the Control Hub UI or to run engines.

Browser

When you use a web browser to access the Control Hub UI from a machine that resides behind a firewall or in a system that limits access to specific IP addresses, ensure that your firewall allows outbound connectivity to the following systems:

| System | DNS and IP Address | Port | Protocol | Usage |
|---|---|---|---|---|
| Deployed engines | Location where engines are running | HTTPS port defined in the engine advanced configuration properties of the deployment | TCP, TLS 1.2 | When using direct engine REST APIs for browser-to-engine communication, web browsers must be able to directly reach engines. |

In most cases, you can use the default WebSocket tunneling communication method and do not need to allow outbound connections from browser machines to the engine machines. For more information, see Engine Communication.
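If you do use direct engine REST APIs, one way to confirm from a browser machine that the firewall allows the connection is to attempt a TLS handshake against the engine HTTPS port. The following is a minimal sketch, not an official StreamSets tool. The function names `make_tls12_context` and `engine_reachable` are hypothetical, and the `verify` switch is an assumption added for engines that present self-signed certificates.

```python
import socket
import ssl


def make_tls12_context(verify=True):
    """Build a client context that refuses anything older than TLS 1.2."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    if not verify:
        # Assumption: only for test setups with self-signed certificates.
        # check_hostname must be disabled before verify_mode is relaxed.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    return ctx


def engine_reachable(host, port, timeout=5.0, verify=True):
    """Return the negotiated TLS version string, or None on any failure.

    A None result can mean the firewall blocked the connection, the port
    is closed, or the TLS handshake or certificate check failed.
    """
    ctx = make_tls12_context(verify)
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with ctx.wrap_socket(raw, server_hostname=host) as tls:
                return tls.version()
    except OSError:  # ssl.SSLError is a subclass of OSError
        return None
```

For example, `engine_reachable("engine.example.com", 8443)` would return the negotiated version on success. Both the host name and the port are placeholders; use the engine location and the HTTPS port configured for your deployment.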

Engines

When you deploy Data Collector engines to on-premises or cloud computing machines that reside behind a firewall or in a system that limits access to specific IP addresses, allow the required inbound and outbound traffic to each machine.

Inbound Connections

Control Hub does not directly send requests to machines running Data Collector engines. However, you must configure your firewall to allow the following inbound connections to the machines, depending on the engine configuration:

| Port | Protocol | Usage |
|---|---|---|
| HTTPS port defined in the engine advanced configuration properties of the deployment | TCP | When using direct engine REST APIs for browser-to-engine communication, web browsers must be able to reach engines on the configured HTTPS port number. |

In most cases, you can use the default WebSocket tunneling communication method and do not need to allow an inbound connection to the HTTPS port number. For more information, see Engine Communication.

In addition, if you want to use SSH to connect to machines running Data Collector engines, configure your firewall to allow the following inbound connection to the machines. Control Hub does not require SSH access to the machines. However, you might want to enable access for troubleshooting purposes.

| Port | Protocol | Usage |
|---|---|---|
| 22 | TCP | Optionally connect to the machine using SSH. |
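The inbound requirements above reduce to two independent choices: whether browsers use direct engine REST APIs, and whether you want SSH access. The sketch below expresses that decision as data you could feed into your own firewall tooling. The `InboundRule` type and `inbound_rules` function are hypothetical names used for illustration, and the port value in the usage note is a placeholder rather than a StreamSets default.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InboundRule:
    port: int
    protocol: str
    reason: str


def inbound_rules(engine_https_port, direct_rest_api=False, allow_ssh=False):
    """List the inbound rules an engine machine needs.

    With the default WebSocket tunneling method and no SSH access,
    the list is empty because no inbound connection is required.
    """
    rules = []
    if direct_rest_api:
        # Browsers must reach the engine on its configured HTTPS port.
        rules.append(InboundRule(engine_https_port, "TCP",
                                 "browser-to-engine direct REST API"))
    if allow_ssh:
        # Optional; Control Hub itself does not require SSH access.
        rules.append(InboundRule(22, "TCP",
                                 "optional SSH for troubleshooting"))
    return rules
```

Calling `inbound_rules(8443)` returns an empty list, while `inbound_rules(8443, direct_rest_api=True, allow_ssh=True)` returns one rule for port 8443 and one for port 22.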

Outbound Connections

Data Collector engines make outbound connections to the following systems. Ensure that your firewall allows outbound connectivity to these systems.

| System | DNS and IP Address | Port | Protocol | Usage |
|---|---|---|---|---|
| External origin and destination systems | Depends on the system | Depends on the system | Depends on the system | External system connections so that pipeline stages can process your data. |
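Because the addresses and ports for external systems depend on your pipelines, it can help to test outbound connectivity from an engine machine against your own list of endpoints. The following sketch checks plain TCP reachability only. It does not validate TLS or any application protocol, and `check_outbound` is a hypothetical helper, not part of StreamSets.

```python
import socket


def check_outbound(endpoints, timeout=3.0):
    """Try a TCP connection to each (host, port) pair.

    Returns a dict mapping each pair to True if the connection was
    established, or False if it was refused, blocked, or timed out.
    """
    results = {}
    for host, port in endpoints:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                results[(host, port)] = True
        except OSError:
            results[(host, port)] = False
    return results
```

Pass the hosts and ports of the origin and destination systems that your pipelines use, for example `check_outbound([("db.example.com", 5432)])`, where the host and port are placeholders.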