Register Transformer

To register Transformer with Control Hub, you generate an authentication token and modify the Transformer configuration files.

The method you use to register Transformer depends on the Transformer installation type:
  • Tarball installation: You can register Transformer from the command line interface or from Control Hub.
  • RPM installation: You must register Transformer from Control Hub.

A registered Transformer communicates with Control Hub at regular intervals. If Transformer cannot connect to Control Hub because of a network or system outage, Transformer uses Control Hub disconnected mode.

Before you register Transformer, ensure that you have enabled HTTPS for Transformer.

Registering from the Command Line Interface

For a Transformer tarball installation, you can register Transformer with Control Hub using the command line interface.

Note: For a Transformer RPM installation, you must use Control Hub to register Transformer.

When you register Transformer from the command line interface, Transformer generates the authentication token and modifies the configuration files for you. Transformer must be running before you can use the command line interface.

Start Transformer, and then use the system enableDPM command to register Transformer.

Use the command from the $TRANSFORMER_DIST directory as follows:

bin/streamsets cli \
(-U <sdcURL> | --url <sdcURL>) \
[(-D <dpmURL> | --dpmURL <dpmURL>)] \
[(-a <sdcAuthType> | --auth-type <sdcAuthType>)] \
[(-u <sdcUser> | --user <sdcUser>)] \
[(-p <sdcPassword> | --password <sdcPassword>)] \
system enableDPM \
(--dpmUrl <dpmBaseURL>) \
(--dpmUser <dpmUserID>) \
(--dpmPassword <dpmUserPassword>) \
[(--labels <labels>)]

When using the system enableDPM command, the following basic options are required:

Basic Option | Description
-U <sdcURL> or --url <sdcURL> | Required. URL of the Transformer. The default URL is http://localhost:19630.
-D <dpmURL> or --dpmURL <dpmURL> | Required. URL to access Control Hub:
  • For Control Hub cloud, set to https://cloud.streamsets.com.
  • For Control Hub on-premises, set to the Control Hub URL provided by your system administrator. For example, https://<hostname>:18631.

The following table describes the enableDPM options:

Enable DPM Option | Description
--dpmUrl <dpmBaseURL> | Required. URL to access Control Hub:
  • For Control Hub cloud, set to https://cloud.streamsets.com.
  • For Control Hub on-premises, set to the Control Hub URL provided by your system administrator. For example, https://<hostname>:18631.
--dpmUser <dpmUserID> | Required. Enter your Control Hub user ID using the following format: <ID>@<organization ID>
--dpmPassword <dpmUserPassword> | Required. Enter the password for your Control Hub user account.
--labels <labels> | Optional. Assign a label to this Transformer. You can enter multiple labels separated by commas. Labels that you assign here are defined in the Control Hub configuration file, $TRANSFORMER_CONF/dpm.properties. To remove these labels after you register the Transformer, you must modify the configuration file.

For example, the following command registers a Transformer with Control Hub and assigns three labels to the Transformer:

bin/streamsets cli -U http://localhost:19630 -D https://cloud.streamsets.com \
system enableDPM --dpmUrl https://cloud.streamsets.com \
--dpmUser alison@MyOrg --dpmPassword MyPassword \
--labels Finance,Accounting,Development

Restart Transformer to apply the changes.
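
The registration command can also be assembled by a small script so that the arguments are checked before anything runs. The following is a minimal sketch and not part of the Transformer distribution: the register_cmd function and the DPM_PASSWORD environment variable are hypothetical names. The function only prints a command built from the options documented above, so that you can review it before running it from the $TRANSFORMER_DIST directory.

```shell
# Hypothetical helper: print (do not run) the enableDPM registration command.
register_cmd() {
  transformer_url=$1   # value for -U
  dpm_url=$2           # value for -D and --dpmUrl
  dpm_user=$3          # value for --dpmUser
  labels=$4            # optional, comma-separated value for --labels

  # The Control Hub user ID must look like <ID>@<organization ID>.
  case $dpm_user in
    ?*@?*) ;;
    *) echo "user ID must use the format <ID>@<organization ID>" >&2; return 1 ;;
  esac

  # The password is left as a reference to the (assumed) DPM_PASSWORD variable
  # so that it never appears in the printed command.
  cmd="bin/streamsets cli -U $transformer_url -D $dpm_url system enableDPM --dpmUrl $dpm_url --dpmUser $dpm_user --dpmPassword \"\$DPM_PASSWORD\""

  # Add --labels only when labels were supplied.
  if [ -n "$labels" ]; then
    cmd="$cmd --labels $labels"
  fi
  printf '%s\n' "$cmd"
}
```

For example, register_cmd http://localhost:19630 https://cloud.streamsets.com alison@MyOrg Finance,Accounting prints the full command, while a user ID without an organization ID is rejected with an error.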

Using a Publicly Accessible URL

If you register a Transformer that is installed on a cloud computing platform such as Amazon Elastic Compute Cloud (EC2), configure Transformer to use a publicly accessible URL.

When you register Transformer with Control Hub, Transformer sends its URL to Control Hub in the format http://<hostname>:<http.port>, where <hostname> is the value defined in the http.bindHost property in the Transformer configuration file, $TRANSFORMER_CONF/transformer.properties. If the host name is not defined in http.bindHost, Transformer runs the following command to determine the host name: hostname -f

For most cloud computing platforms, the hostname -f command returns the private IP address of the machine on the cloud platform. Control Hub includes the private IP address in the Transformer URL displayed in Control Hub. However, when you click the Transformer URL, you cannot access Transformer because you must use a public IP address to access a cloud machine.
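
To preview the URL that would be sent to Control Hub, the logic described above can be approximated in a few lines of shell. This is only a sketch of the documented behavior, not Transformer code: reported_url is a hypothetical helper, and you pass it the http.bindHost and http.port values yourself.

```shell
# Hypothetical helper: approximate the URL that Transformer reports to Control Hub.
reported_url() {
  bind_host=$1   # value of http.bindHost, or empty if the property is not defined
  http_port=$2   # value of http.port

  if [ -z "$bind_host" ]; then
    # Fallback described in the text when http.bindHost is not defined.
    bind_host=$(hostname -f)
  fi
  printf 'http://%s:%s\n' "$bind_host" "$http_port"
}
```

Running reported_url "" 19630 on the cloud machine shows the host name that hostname -f resolves to there, which helps confirm whether the reported URL is reachable from outside the platform.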

To access Transformer installed on a cloud computing platform from Control Hub, uncomment the transformer.base.http.url property in the Transformer configuration file, $TRANSFORMER_CONF/transformer.properties, and then configure it to use the publicly accessible URL to that Transformer.

After modifying the configuration file, restart Transformer for the changes to take effect.
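
The property change can be scripted. The following sketch assumes that the property appears in the file on its own line, either commented out with a leading # or already set. The set_base_url function and the example URL are hypothetical, so check the result against your own transformer.properties file before restarting.

```shell
# Hypothetical helper: uncomment and set transformer.base.http.url in a properties file.
set_base_url() {
  props_file=$1    # for example, $TRANSFORMER_CONF/transformer.properties
  public_url=$2    # publicly accessible URL of this Transformer
  tmp_file="${props_file}.tmp"

  awk -v url="$public_url" '
    # Match the property whether or not it is commented out.
    /^[ \t]*#?[ \t]*transformer\.base\.http\.url[ \t]*=/ {
      # Write the property once and drop any further matching lines.
      if (!done) { print "transformer.base.http.url=" url; done = 1 }
      next
    }
    { print }
    # Append the property if the file did not contain it at all.
    END { if (!done) print "transformer.base.http.url=" url }
  ' "$props_file" > "$tmp_file" && mv "$tmp_file" "$props_file"
}
```

A call such as set_base_url "$TRANSFORMER_CONF/transformer.properties" "https://transformer.example.com:19630" rewrites the file in place and leaves all other properties unchanged.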

Using a Proxy Server

You can configure each registered Transformer to use an authenticated HTTP or HTTPS proxy server for outbound requests made to Control Hub. Define the proxy properties in the TRANSFORMER_JAVA_OPTS environment variable.

Modify environment variables using the method required by your installation type.

Add the following Java options to the TRANSFORMER_JAVA_OPTS environment variable:

  • https.proxyUser
  • https.proxyPassword
  • https.proxyHost
  • https.proxyPort

If the proxy server uses HTTP instead of HTTPS, use http.<property name> for each property.

For example, to configure a registered Transformer to use an HTTPS proxy server on host 138.0.0.1 and port 3138, define TRANSFORMER_JAVA_OPTS as follows:

export TRANSFORMER_JAVA_OPTS="${TRANSFORMER_JAVA_OPTS} -Xmx1024m -Xms1024m -Dhttps.proxyUser=MyName -Dhttps.proxyPassword=MyPsswrd -Dhttps.proxyHost=138.0.0.1 -Dhttps.proxyPort=3138 -server"

Note: Oracle JDK 8 update 111 disabled Basic authentication for HTTPS tunneling through an HTTP proxy. If Transformer runs on a machine with Java 8u111 or later, consider using an HTTPS proxy server. Alternatively, as a workaround, add the following Java property to the TRANSFORMER_JAVA_OPTS environment variable, setting the property to an empty string:

-Djdk.http.auth.tunneling.disabledSchemes=''

However, use this workaround with caution since it exposes credentials by sending them through an unencrypted proxy.
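
Because the four proxy properties differ only in their http or https prefix, they can be generated from a single helper. This is a minimal sketch: proxy_opts is a hypothetical function, and it assumes that none of the values contain spaces or shell quoting characters.

```shell
# Hypothetical helper: print the proxy Java options for the given scheme.
proxy_opts() {
  scheme=$1     # "https" or "http", matching the proxy server
  host=$2
  port=$3
  user=$4
  password=$5

  # Only the two prefixes described in the text are accepted.
  case $scheme in
    http|https) ;;
    *) echo "scheme must be http or https" >&2; return 1 ;;
  esac

  printf '%s\n' "-D${scheme}.proxyUser=${user} -D${scheme}.proxyPassword=${password} -D${scheme}.proxyHost=${host} -D${scheme}.proxyPort=${port}"
}
```

The output can then be appended to the environment variable, for example with export TRANSFORMER_JAVA_OPTS="${TRANSFORMER_JAVA_OPTS} $(proxy_opts https 138.0.0.1 3138 MyName MyPsswrd)".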