Enabling HTTPS
- Data Collector: Enable HTTPS for Data Collector to secure communication with the Data Collector UI and REST API, and to use the Data Collector as an authoring Data Collector in Control Hub.
- Cluster pipelines: If you run cluster pipelines, enable HTTPS for them to secure communication between the gateway and worker nodes in the cluster.
- Pipeline stages that connect to external systems: During pipeline development, developers can enable specific stages to use SSL/TLS to secure communication with an external system. For example, when designing a pipeline that writes to a Cassandra cluster enabled for SSL/TLS, the developer must configure the Cassandra destination to use SSL/TLS to connect to Cassandra.
By default, Data Collector and cluster pipelines use the HTTP protocol. StreamSets recommends using HTTPS in a production environment. HTTPS requires SSL/TLS certificates.
By default, the web browser that you use to access Control Hub communicates with deployed Data Collectors through WebSocket tunneling. WebSocket tunneling keeps your data secure and does not require additional setup.
However, when you preview a pipeline or capture a snapshot of an active job, your source data passes over encrypted connections beyond your corporate network into Control Hub, and then back to your web browser. If corporate regulations require that your data remain behind a firewall, you can configure the browser to use direct engine REST APIs to communicate with the engines behind the firewall. For more information, see Engine Communication in the Control Hub documentation.
When using direct engine REST APIs, you must enable Data Collector to use the HTTPS protocol.
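Before switching the browser to direct engine REST APIs, it can help to confirm that a Data Collector actually answers over HTTPS with a certificate that clients trust. The following is a minimal sketch of such a check using only the Python standard library. The host name `sdc.example.com`, the port 18630, and the function names `parse_https_endpoint` and `check_tls` are assumptions made for illustration; they are not part of Data Collector or Control Hub.

```python
import socket
import ssl
from typing import Optional, Tuple
from urllib.parse import urlsplit


def parse_https_endpoint(url: str) -> Tuple[str, int]:
    """Return (host, port) for an https:// URL; raise ValueError otherwise."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        raise ValueError(f"expected an https:// URL, got {url!r}")
    if not parts.hostname:
        raise ValueError(f"no host in URL {url!r}")
    # Fall back to the standard HTTPS port when the URL does not name one.
    return parts.hostname, parts.port or 443


def check_tls(url: str, timeout: float = 5.0) -> Optional[str]:
    """Complete a verified TLS handshake and return the negotiated protocol version."""
    host, port = parse_https_endpoint(url)
    # The default context verifies the certificate chain and the host name.
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version()


if __name__ == "__main__":
    # Hypothetical engine URL; replace with the URL of your own Data Collector.
    print(check_tls("https://sdc.example.com:18630"))
```

If the handshake fails, for example because the port still serves plain HTTP or the certificate is not trusted by the client, `check_tls` raises an `ssl.SSLError` or another `OSError` rather than returning a value. A self-signed certificate would need to be added to the client's trust store before this check passes.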