Firewall Configuration for IBM StreamSets as a Service
Applies to: IBM StreamSets as a Service
If you access IBM StreamSets from machines that reside behind a firewall or in a system that limits access to specific DNS names and IP addresses, you must allow the required inbound and outbound traffic to each machine.
The requirements differ, based on whether the machines are used to launch a web browser to access the Control Hub UI or are used to run engines.
Browser
When you use a web browser to access the Control Hub UI from a machine that resides behind a firewall or in a system that limits access to specific IP addresses, ensure that your firewall allows outbound connectivity to the following systems.
System | DNS and IP Address | Port | Protocol | Usage |
---|---|---|---|---|
IBM StreamSets authentication service | cloud.login.streamsets.com Allow all of the following: | 443 | TCP, TLS 1.2 or later | User authentication. |
IBM StreamSets identity provider | identitytoolkit.googleapis.com | 443 | TCP, TLS 1.2 or later | Identity management for username/password and social logins. SAML logins use the IBM StreamSets authentication service. |
Control Hub | Allow all of the following: | 443 | TCP, TLS 1.2 or later | Web browser access to the Control Hub UI. |
Deployed engines | Location where engines are running | HTTPS port defined in the engine advanced configuration properties of the deployment | TCP, TLS 1.2 | When using direct engine REST APIs for browser to engine communication, web browsers must be able to directly reach engines. In most cases, you can use the default WebSocket tunneling communication method and do not need to allow outbound connections from browser machines to the engine machines. For more information, see Engine Communication. |
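One way to confirm the browser-side rules is to attempt a TLS handshake from the browser machine to each named endpoint. The following is a minimal sketch, not an official IBM StreamSets tool: the two hostnames and port 443 come from the table above, while the function names, the timeout, and the status strings are illustrative assumptions. A failure can also come from DNS or proxy problems, so treat the result as a hint rather than proof that the firewall is at fault.

```python
import socket
import ssl

# Hostnames and port taken from the table above.
BROWSER_ENDPOINTS = [
    ("cloud.login.streamsets.com", 443),
    ("identitytoolkit.googleapis.com", 443),
]


def make_tls_context() -> ssl.SSLContext:
    """Return a client context that refuses anything older than TLS 1.2."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def check_endpoint(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt a TLS handshake and return a one-line status string."""
    context = make_tls_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return f"OK     {host}:{port} ({tls.version()})"
    except OSError as exc:
        # Covers DNS failures, refused or timed-out connections, and TLS errors.
        return f"FAILED {host}:{port} ({exc})"


if __name__ == "__main__":
    for host, port in BROWSER_ENDPOINTS:
        print(check_endpoint(host, port))
```

Run the script on the machine that launches the browser. The Control Hub addresses are not listed in the sketch; add the entries that apply to your account to `BROWSER_ENDPOINTS`.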
Engines
When you deploy engines to on-premises or cloud computing machines that reside behind a firewall or in a system that limits access to specific IP addresses, allow the required inbound and outbound traffic to each machine.
Inbound Connections
Control Hub does not directly send requests to machines running engines. However, you must configure your firewall to allow the following inbound connections to the machines, depending on the engine type and configuration:
Engine Type | Port | Protocol | Usage |
---|---|---|---|
Transformer | Transformer port - 19630 by default | TCP | The Apache Spark cluster must be able to access Transformer at this port number to send the status, metrics, and offsets for running pipelines. |
All | HTTPS port defined in the engine advanced configuration properties of the deployment | TCP | When using direct engine REST APIs for browser to engine communication, web browsers must be able to reach engines on the configured HTTPS port number. In most cases, you can use the default WebSocket tunneling communication method and do not need to allow an inbound connection to the HTTPS port number. For more information, see Engine Communication. |
In addition, if you want to use SSH to connect to machines running engines, configure your firewall to allow the following inbound connection to the machines. Control Hub does not require SSH access to the machines. However, you might want to enable access for troubleshooting purposes.
Engine Type | Port | Protocol | Usage |
---|---|---|---|
All | 22 | TCP | Optionally connect to the machine using SSH. |
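To verify the inbound rules, you can try a plain TCP connection to the engine machine from the system that needs access, for example from a Spark cluster node for the Transformer port. The following is a rough sketch under stated assumptions: `engine.example.internal` is a hypothetical hostname, `is_port_open` is an illustrative helper, and the ports shown are the Transformer default (19630) and SSH (22) from the tables above. A successful connection only shows that the port is reachable, not that the engine is healthy.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or the name did not resolve.
        return False


if __name__ == "__main__":
    # Hypothetical engine host; replace with your own machine name or IP.
    engine_host = "engine.example.internal"
    # 19630 is the default Transformer port; 22 is the optional SSH port.
    for label, port in [("Transformer", 19630), ("SSH", 22)]:
        state = "reachable" if is_port_open(engine_host, port) else "not reachable"
        print(f"{label} port {port} on {engine_host}: {state}")
```

If you changed the Transformer port or configured an engine HTTPS port for direct REST API communication, substitute those port numbers.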
Outbound Connections
Engines make outbound connections to the following systems. Ensure that your firewall allows outbound connectivity to these systems.
System | DNS and IP Address | Port | Protocol | Usage |
---|---|---|---|---|
Control Hub | Allow all of the following: | 443 | TCP, TLS 1.2 or later | Engine communication with Control Hub. |
IBM StreamSets server that hosts engine installation and stage library files | archives.streamsets.com | 443 | TCP, TLS 1.2 | Engine and stage library file downloads. |
IBM StreamSets telemetry server | Allow all of the following: | 443 | HTTPS | Telemetry data collection. |
External origin and destination systems | Depends on the system | Depends on the system | Depends on the system | External system connections so that pipeline stages can process your data. |
System | DNS | Port | Protocol | Usage |
---|---|---|---|---|
Docker Hub | Allow all of the following: | 443 | TCP, TLS 1.2 or later | Pull engine images from Docker Hub. |
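When a firewall filters by IP address rather than by DNS name, it can help to see which addresses a required name currently resolves to from the engine machine. The following is a small sketch, assuming only the standard library: `archives.streamsets.com` comes from the outbound table above, and `resolve_addresses` is an illustrative helper. Resolved addresses can change over time and vary by location, so the output is a diagnostic aid and not a substitute for the published address lists.

```python
import socket


def resolve_addresses(host: str, port: int = 443) -> list[str]:
    """Return the sorted, de-duplicated IP addresses that host resolves to."""
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # Hostname taken from the outbound connections table.
    host = "archives.streamsets.com"
    try:
        for address in resolve_addresses(host):
            print(f"{host} -> {address}")
    except socket.gaierror as exc:
        print(f"Could not resolve {host}: {exc}")
```

Compare the printed addresses with the rules in your firewall, and repeat the lookup for any other DNS names that your deployment requires.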