Enabling HTTPS

By default, the Control Hub web browser uses WebSocket tunneling to communicate with deployed Transformers. WebSocket tunneling ensures that your data is secure and does not require additional setup.

However, when you preview a pipeline or capture a snapshot of an active job, your source data passes through encrypted connections that extend beyond your corporate network into Control Hub, and then back to your web browser. If corporate regulations require your data to remain behind a firewall, you can configure the browser to use direct engine REST APIs to communicate with the engines behind the firewall. For more information, see Engine Communication in the Control Hub documentation.

When using direct engine REST APIs, you must enable Transformer to use the HTTPS protocol.

Prerequisites

Before you enable HTTPS for Transformer, complete the following requirements:

Obtain access to OpenSSL and Java keytool
If you do not have a keystore file that includes an SSL/TLS certificate signed by a certificate authority (CA), you can request a certificate and create the keystore file using the following tools:
  • OpenSSL - Use OpenSSL to create a Certificate Signing Request (CSR) that you send to the CA of your choice, as well as to create the keystore and truststore files. For more information, see the OpenSSL documentation.
  • Java keytool - You can also use Java keytool to create a CSR and to create keystore and truststore files. Java keytool is part of the Java Development Kit (JDK). For more information, see the keytool documentation.
Generate the SSL/TLS certificate and private key pair signed by a certificate authority (CA)
To enable HTTPS for Transformer, generate a private key and public certificate pair for Transformer. Transformer provides a self-signed certificate that you can use. However, web browsers generally issue a warning for self-signed certificates. StreamSets strongly recommends that you generate a key and certificate pair signed by a CA.
Important: The signed certificate must include the fully qualified domain name (FQDN) for the Transformer machine.
To obtain a certificate from a trusted CA, you must provide proof that you are the owner of the domain name for which you are requesting the certificate. Use OpenSSL or keytool to generate a key pair and then submit a Certificate Signing Request (CSR) to the CA. The exact procedure depends on the CA that you choose. For details, see the documentation provided by the CA.
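
As a rough illustration of the key pair and CSR step with OpenSSL, the following sketch generates an unencrypted RSA private key and a CSR that carries the FQDN in both the common name and the subject alternative name. The host name tx.company.com, the file names, and the key size are assumptions chosen to match the examples on this page. Your CA might require different subject fields or key parameters, and the -addext option requires OpenSSL 1.1.1 or later.

```shell
#!/bin/sh
set -eu

# Hypothetical FQDN of the Transformer machine; replace with your own.
fqdn="tx.company.com"

# Work in a scratch directory so that no existing files are overwritten.
workdir="$(mktemp -d)"
key="$workdir/tx_company_com.key"
csr="$workdir/tx_company_com.csr"

# Generate a 2048-bit RSA private key and a CSR in one step.
# -nodes leaves the private key unencrypted.
# -addext adds the FQDN as a subject alternative name (OpenSSL 1.1.1+).
openssl req -new -newkey rsa:2048 -nodes \
  -keyout "$key" -out "$csr" \
  -subj "/CN=$fqdn" \
  -addext "subjectAltName=DNS:$fqdn"

# Keep the private key readable only by its owner.
chmod 600 "$key"

# Check the CSR self-signature and print the subject before sending it to the CA.
openssl req -in "$csr" -noout -verify -subject
```

Send only the CSR file to the CA. The private key stays on your machine and is later combined with the signed certificate in the keystore.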

Create a Keystore File

Create a keystore file that includes the private key and public certificate pair signed by the CA. Transformer presents the certificate in the keystore to prove its identity to clients, such as web browsers, that connect over SSL/TLS.

StreamSets recommends using a certificate signed by a trusted CA. If the certificate is not signed by a trusted CA, such as a self-signed certificate, you must also add the certificate to the truststore.

StreamSets also recommends creating keystores in the PKCS #12 (p12 file) format. In most cases, a CA issues certificates in PEM format. Use OpenSSL to directly import the certificate into a PKCS #12 keystore.

  1. Use the following command to import the certificate and private key issued in PEM format to a PKCS #12 keystore for Transformer:
    openssl pkcs12 -export -in <PEM certificate> -inkey <private key> -out <keystore filename> -name <keystore name>

    You will be prompted to create a password for the keystore file.

    For example, the following command converts the certificate tx_company_com.pem and private key tx_company_com.key to the PKCS #12 keystore file named tx_company_com.p12:
    openssl pkcs12 -export -in tx_company_com.pem -inkey tx_company_com.key -out tx_company_com.p12 -name tx_company_com
  2. Store the keystore password in a password text file named keystore-password.txt.
    Tip: To ensure that a newline character is not added after the password, run the following command:
    echo -n "<password>" > keystore-password.txt
  3. Store the Transformer keystore file and password text file in the Transformer resources directory, <installation_dir>/externalResources/resources.
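
The keystore steps above can also be run without the interactive password prompt, and the result can be checked before you copy the files to the resources directory. The following is a minimal sketch under stated assumptions: it creates a throwaway self-signed pair only so that the example is self-contained, and the password example-password is a placeholder. With a real CA-signed certificate, skip the first openssl command and use your own files.

```shell
#!/bin/sh
set -eu

# Work in a scratch directory so that no existing files are overwritten.
workdir="$(mktemp -d)"
cd "$workdir"

# Throwaway self-signed pair, for demonstration only. In practice, use the
# CA-signed certificate and private key described in the prerequisites.
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -keyout tx_company_com.key -out tx_company_com.pem \
  -subj "/CN=tx.company.com"

# printf writes the password without a trailing newline, like echo -n,
# but behaves the same way across shells. The password is a placeholder.
printf '%s' "example-password" > keystore-password.txt
chmod 600 keystore-password.txt

# Build the keystore non-interactively, reading the password from the file.
openssl pkcs12 -export -in tx_company_com.pem -inkey tx_company_com.key \
  -out tx_company_com.p12 -name tx_company_com \
  -passout file:keystore-password.txt

# Confirm that the stored password opens the keystore and show the
# subject and expiry date of the certificate inside it.
openssl pkcs12 -in tx_company_com.p12 -nokeys \
  -passin file:keystore-password.txt |
  openssl x509 -noout -subject -enddate
```

If the last command fails, the password file does not match the keystore, which is easier to diagnose at this point than after Transformer restarts.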

Create a Truststore File

A truststore file contains certificates from trusted CAs that an SSL/TLS client uses to verify the identity of an SSL/TLS server. Transformer uses the default Java truststore file located in $JAVA_HOME/jre/lib/security/cacerts.

When Transformer is enabled for HTTPS and you run a cluster pipeline that launches a Spark application, the default Java truststore file is included with the application. When the Spark application sends status and metrics about running pipelines to Transformer, the HTTPS certificates must be trusted by the default Java truststore.

When Transformer runs pipelines on a Spark cluster and the Transformer HTTPS certificates are signed by a private CA or not trusted by the default Java truststore, you must create a custom truststore file or modify a copy of the default Java truststore file. For example, if your organization generates its own certificates, you must add the root and intermediate certificates for your organization to the truststore file.

You do not need to create a truststore file and can skip this step in the following situations:
  • Transformer runs only local pipelines.
  • Transformer runs pipelines on a Spark cluster and your certificates are signed by a trusted CA included in the default Java truststore file.

You can create the following types of truststores for Transformer:
  • Java keystore file (JKS)
  • PKCS #12 (p12 file)

The following steps show how to modify a copy of the default truststore file to add a CA to the list of trusted CAs. If you prefer to create a custom truststore file, see the keytool documentation.

  1. On the Transformer machine, use the following command to set the JAVA_HOME environment variable:
    export JAVA_HOME=<Java home directory>
  2. Use the following command to set the TRANSFORMER_RESOURCES environment variable:
    export TRANSFORMER_RESOURCES=<Transformer resources directory>
    For example:
    export TRANSFORMER_RESOURCES=streamsets-transformer-4.1.0/externalResources/resources
  3. Use the following command to copy the default Java truststore file to the Transformer resources directory:
    cp "${JAVA_HOME}/jre/lib/security/cacerts" "${TRANSFORMER_RESOURCES}/truststore.jks"
  4. Use the following keytool command to import the CA certificate into the truststore file:
    keytool -import -file <CA certificate> -trustcacerts -noprompt -alias <CA alias> -storepass <password> -keystore "${TRANSFORMER_RESOURCES}/truststore.jks"
    For example:
    keytool -import -file tx_company_com.pem -trustcacerts -noprompt -alias MyCorporateCA -storepass changeit -keystore "${TRANSFORMER_RESOURCES}/truststore.jks"
  5. Store the truststore password in a password text file named truststore-password.txt.
    Tip: To ensure that a newline character is not added after the password, run the following command:
    echo -n "<password>" > truststore-password.txt
  6. Make sure that both the Transformer truststore file and the password text file are stored in the Transformer resources directory, <installation_dir>/externalResources/resources.
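
The copy command in step 3 assumes that the default truststore is located under jre/lib/security. A Java 8 JDK uses that layout, but Java 9 and later keep the file under lib/security. The following sketch shows one way to locate the file in either layout. The function name find_cacerts is hypothetical and is not part of Transformer or the JDK.

```shell
#!/bin/sh
set -eu

# find_cacerts JAVA_HOME_DIR
# Prints the path of the default Java truststore for the given Java home.
# A Java 8 JDK keeps the file under jre/lib/security. Java 9 and later,
# and a standalone Java 8 JRE, keep it under lib/security.
find_cacerts() {
  for candidate in "$1/jre/lib/security/cacerts" "$1/lib/security/cacerts"; do
    if [ -f "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  echo "No cacerts file found under $1" >&2
  return 1
}
```

With the function defined, the copy in step 3 could be written as follows:

    cp "$(find_cacerts "$JAVA_HOME")" "${TRANSFORMER_RESOURCES}/truststore.jks"

After the import in step 4, you can confirm that the CA certificate is present by listing its alias:

    keytool -list -alias MyCorporateCA -storepass changeit -keystore "${TRANSFORMER_RESOURCES}/truststore.jks"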

Configure Transformer to Use HTTPS

Modify the Transformer configuration properties to configure Transformer to use a secure port and your keystore file. If you created a custom truststore file or modified a copy of the default Java truststore file, configure Transformer to use that truststore file.

  1. When using one of the cloud service provider deployments, such as an Amazon EC2 or a Google Compute Engine (GCE) deployment, locate the public IP address of the provisioned instance.
    1. Launch the deployment to provision the instance.
    2. Use the console for your cloud service provider to locate the provisioned instance.
    3. Copy the public IP address of the instance.
  2. In Control Hub, edit the deployment. In the Configure Engine section, click Advanced Configuration. Then, click Transformer Configuration.
  3. Configure the following properties:
    Transformer HTTPS Property Description
    https.port

    Secure port number for Transformer. For example, 19636.

    Any number besides -1 enables the secure port number.

    Note: When both the HTTP and HTTPS port properties are defined, requests to the HTTP port are redirected to the HTTPS port.

    transformer.base.http.url

    Transformer URL using the HTTPS protocol and the secure port number configured in the https.port property.

    For a cloud service provider deployment, use the public IP address that you copied from the cloud service provider console. For example:

    transformer.base.http.url=https://<IP address>:19636

    For a self-managed deployment where Transformer runs on a local on-premises machine, you might use the name of the host machine. For example:

    transformer.base.http.url=https://myhost:19636

    Important: For a self-managed deployment where Transformer runs on a cloud computing machine, use the public IP address of that instance.

    The specified URL can also act as a default cluster callback URL. For more information, see Understanding the Spark Cluster Callback URL.

    Be sure to uncomment the property.

    https.keystore.path

    Path and name of the keystore file. Enter an absolute path or a path relative to the Transformer resources directory.

    For example, to use a keystore file named tx_company_com.p12 stored in the resources directory, configure the property as follows:

    https.keystore.path=tx_company_com.p12

    Note: The default is keystore.jks, which provides a self-signed certificate that you can use. However, StreamSets strongly recommends that you generate a certificate signed by a trusted CA, as described in Prerequisites.

    https.keystore.password

    Password to open the keystore file.

    For example, if you added the password to a text file named keystore-password.txt and stored the file in the Transformer resources directory, configure the property as follows:

    https.keystore.password=${file("keystore-password.txt")}

    https.truststore.path

    Path and name of the truststore file.

    If you created a custom truststore file or modified a copy of the default Java truststore file, uncomment this property and enter an absolute path or a path relative to the Transformer resources directory.

    For example, to use a truststore file named truststore.jks stored in the resources directory, configure the property as follows:

    https.truststore.path=truststore.jks

    If you do not uncomment and configure the property, Transformer uses the default Java truststore file located in $JAVA_HOME/jre/lib/security/cacerts.

    https.truststore.password

    Password to open the truststore file.

    Uncomment this property to specify the location of the password.

    For example, if you added the password to a text file named truststore-password.txt and stored the file in the Transformer resources directory, configure the property as follows:

    https.truststore.password=${file("truststore-password.txt")}

  4. Save the changes to the deployment and restart all engine instances.
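
After the engine instances restart, you might want to confirm that Transformer presents the expected certificate on the secure port. The following is a minimal sketch that uses the OpenSSL s_client command. The function name show_https_cert is hypothetical, and the host and port in the usage example are the example values used earlier on this page.

```shell
#!/bin/sh
set -eu

# show_https_cert HOST PORT
# Opens a TLS connection to HOST:PORT and prints the subject, issuer, and
# validity dates of the certificate that the server presents.
# Input is taken from /dev/null so that s_client exits after the handshake.
show_https_cert() {
  openssl s_client -connect "$1:$2" -servername "$1" </dev/null 2>/dev/null |
    openssl x509 -noout -subject -issuer -dates
}
```

For example, for a self-managed deployment configured with the URL https://myhost:19636:

    show_https_cert myhost 19636

The subject should contain the FQDN of the Transformer machine, and the issuer should be your CA. If the issuer matches the subject, Transformer is most likely still serving a self-signed certificate.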