Control Hub Configuration

You can edit Control Hub configuration files to configure properties such as the host name, the port number, and the SMTP account information used to send email. You can also customize Control Hub to display your company logo instead of the StreamSets logo in the user interface.

Control Hub configuration files are included in the $DPM_CONF directory. View the comments in each file for a description of each property. If you modify a property in a configuration file, restart Control Hub for the changes to take effect.

Important: Instead of entering sensitive data such as passwords in clear text in the configuration files, you can protect the sensitive data by storing the data in an external location and then using functions to retrieve the data.

The following table lists each configuration file:

File Name                          Description
basic-realm.properties             Configures users that can log in to the Control Hub Admin tool, as described in Configuring Users.
common-to-all-apps.properties      Properties that are common to all Control Hub applications, including the Control Hub base URL, load balancer URL, SMTP account properties to enable Control Hub to send email, and system Data Collector configuration.
connection-app.properties          Properties for the Connection application.
dpm-log4j2.properties              Log configuration properties, including the log level.
dpm.properties                     Properties for Control Hub, including properties to enable HTTPS and to display hosted help.
dynamic_preview-app.properties     Properties for the Dynamic Preview application.
jobrunner-app.properties           Properties for the Job Runner application.
messaging-app.properties           Properties for the Messaging application.
notification-app.properties        Properties for the Notification application.
pipelinestore-app.properties       Properties for the Pipeline Store application, including properties to configure the organization that manages system sample pipelines.
policy-app.properties              Properties for the Policy application.
provisioning-app.properties        Properties for the Provisioning application.
reporting-app.properties           Properties for the Reporting application.
scheduler-app.properties           Properties for the Scheduler application.
sdp_classification-app.properties  Properties for the Classification application.
security-app.properties            Properties for the Security application, including properties to configure LDAP authentication.
sla-app.properties                 Properties for the SLA application.
timeseries-app.properties          Properties for the Time Series application.
topology-app.properties            Properties for the Topology application.

Email Properties

The common-to-all-apps.properties file includes the following properties for using an SMTP server to send email.

Note: In a development environment, you can choose not to use an SMTP server and instead configure Control Hub to use the user ID for each user’s initial password.

Email Property              Description
mail.transport.protocol     Use smtp or smtps. Default is smtp.
mail.smtp.host              SMTP host name. Default is localhost.
mail.smtp.port              SMTP port number. Default is 25.
mail.smtp.auth              Whether the SMTP host uses authentication. Use true or false. Default is false.
mail.smtp.starttls.enable   Whether the SMTP host uses STARTTLS encryption. Use true or false. Default is false.
mail.smtps.host             SMTPS host name. Default is localhost.
mail.smtps.port             SMTPS port number. Default is 465.
mail.smtps.auth             Whether the SMTPS host uses authentication. Use true or false. Default is false.
xmail.username              User name for the email account to send email.
xmail.password              Password for the email account. To protect the password, store the password in an external location and then use a function to retrieve the password.
xmail.from.address          Email address to use to send email.

Initial Passwords without an SMTP Server

When using Control Hub in a production environment, set up an SMTP server so that Control Hub can send automated emails to users. For example, when Control Hub is configured to use an SMTP server and you create a new user, Control Hub sends the new user a welcome email with instructions to log in and set the initial password.

When trying out Control Hub in a development environment, you can choose not to use an SMTP server. In this situation, Control Hub cannot send emails to new users instructing them to set their initial password. You must set the passwordHandler.userIdAsPasswordReset property in the security-app.properties file to true so that each user’s initial password is the same as their user ID.

After logging in for the first time with the user ID as the password, the new user must reset the password.

Protecting Sensitive Data in Configuration Files

You can protect sensitive data in Control Hub configuration files by storing the data in an external location and then using the file or exec function to retrieve the data.

Some Control Hub configuration file properties, such as the https.keystore.password property in the $DPM_CONF/dpm.properties file, require that you enter a password. Instead of entering the password in clear text in a configuration file, you can store the password outside of the configuration file and then use the file or exec function to retrieve the sensitive data.

You can use functions to retrieve sensitive data in the following ways:
From a file
    Store the sensitive data in a separate file and then use the file function in the configuration file to retrieve the data as follows:
    ${file("<filename>")}
    For example, if you configure the https.keystore.password property as follows, Control Hub retrieves the password from the keystore_pwd.txt file:
    https.keystore.password=${file("keystore_pwd.txt")}
    Retrieving sensitive data from another file provides some level of security. However, the sensitive data in the additional file is still stored in clear text, so anyone who can read the file can read the data. For increased security, use a script or executable to retrieve the sensitive data.
Using a script or executable
    Develop a script or executable that retrieves the sensitive data from an external location. For example, you can develop a script that decrypts an encrypted file containing a password, or a script that calls an external REST API to retrieve a password from a remote vault system.
    Use the exec function in the configuration file to call the script or executable as follows:
    ${exec("<script name>")}
    For example, if you configure the https.keystore.password property as follows, Control Hub runs the keystore_pwd.sh script to retrieve the password:
    https.keystore.password=${exec("keystore_pwd.sh")}

When you use either the file or the exec function, Control Hub uses the exact output of the file or script. So if the output produces a password and then a newline character, Control Hub uses the value with the newline character. This causes Control Hub to use a password that is not valid. Carefully design and test how you define the output of the file or script to ensure that the functions return only the expected sensitive data.
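
The trailing-newline pitfall described above can be checked from a shell before Control Hub reads the value. The following is a minimal sketch, assuming a POSIX shell; the value s3cret is a made-up placeholder, not a real password.

```shell
# Hypothetical secret value, used only for illustration.
secret='s3cret'

# echo appends a newline, so its output is one byte longer than the secret.
with_newline=$(echo "$secret" | wc -c | tr -d ' ')

# printf '%s' writes the value exactly, with no trailing newline.
exact=$(printf '%s' "$secret" | wc -c | tr -d ' ')

echo "echo: $with_newline bytes, printf: $exact bytes"
```

The byte counts differ by one, which is the extra newline that would otherwise become part of the password. Writing values with printf '%s' in files and scripts is one way to avoid it.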

Retrieving Sensitive Data from Files

Use the file function in configuration files to retrieve sensitive data from a local file stored in the Control Hub configuration directory.

Each file can store only a single piece of information, and you must save the file in the Control Hub configuration directory, $DPM_CONF. When Control Hub starts, it retrieves the sensitive data from the referenced files.

  1. Create a text file for each configuration value that you want to safeguard.
  2. Include only one configuration value in each file.
    You can store any configuration value in a separate file, but do not include more than one configuration value in a file. Ensure that the file does not include extra characters, such as a newline character, after the sensitive data.
  3. Save the file in the Control Hub configuration directory, $DPM_CONF.
  4. In the configuration file, set the relevant value to the file function and the appropriate file name. Use the required syntax as follows:
    ${file("<filename>")}
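
The steps above might look like the following on the command line. This is a sketch under assumptions: a temporary directory stands in for $DPM_CONF so that the commands are safe to try, the value changeit is a placeholder, and restricting the file to its owner is an extra precaution that the procedure above does not require.

```shell
# Stand-in for the Control Hub configuration directory.
# On a real installation, use "$DPM_CONF" instead.
conf_dir=$(mktemp -d)
pwd_file="$conf_dir/keystore_pwd.txt"

# Create the file readable only by its owner, and write the placeholder
# value with printf '%s' so that no trailing newline is added.
( umask 077 && printf '%s' 'changeit' > "$pwd_file" )

# Verify the result: command substitution strips a trailing newline, so the
# test below is non-empty only when the last byte is not a newline.
byte_count=$(wc -c < "$pwd_file" | tr -d ' ')
if [ -n "$(tail -c 1 "$pwd_file")" ]; then
  echo "ok: file holds exactly $byte_count bytes"
else
  echo "trailing newline found in $pwd_file" >&2
fi
```

With the file saved in the configuration directory, the property would then reference it by file name only, as in ${file("keystore_pwd.txt")}.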

Retrieving Sensitive Data Using Scripts

Use the exec function in configuration files to call a script or executable that retrieves sensitive data from an external location.

You must save the script on the local machine where Control Hub runs. When Control Hub starts, it runs the script to retrieve the sensitive data.

  1. Develop a script or executable to retrieve each configuration value that you want to safeguard.
    Ensure that the script or executable does not output extra characters, such as a newline character, after the sensitive data.
  2. Save the script or executable on the local machine where Control Hub runs.
  3. In the configuration file, set the relevant value to the exec function and use the script or executable file name for the argument. Use the required syntax as follows:
    ${exec("<script name>")}
    If you save the script in the Control Hub configuration directory, $DPM_CONF, enter just the script name for the argument, for example:
    ${exec("database_pwd.sh")}
    If you save the script outside of the Control Hub configuration directory, enter an absolute path for the script name, for example:
    ${exec("/tmp/database_pwd.sh")}
    Important: Enter only the script or executable file name as the function argument. You cannot include parameters for the script within the argument. For example, ${exec("database_pwd.sh -name jobrunner")} is not a valid argument. If the script or executable requires parameters, design a wrapper script to call the original script with the corresponding parameters and then call the wrapper script from the exec function.
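
The wrapper approach described above can be sketched as follows. The sketch is illustrative only: the body of database_pwd.sh is an invented stand-in that prints a fake value, the wrapper name database_pwd_jobrunner.sh is hypothetical, and both scripts are created in a temporary directory so that the example is safe to run.

```shell
# Work in a temporary directory so the sketch does not touch a real install.
work_dir=$(mktemp -d)

# Stand-in for the original script, which requires parameters.
# Its behavior (printing a fake value per name) is invented for this demo.
cat > "$work_dir/database_pwd.sh" <<'EOF'
#!/bin/sh
# Usage: database_pwd.sh -name <application>
[ "$1" = "-name" ] || exit 1
printf '%s' "pwd-for-$2"
EOF

# Hypothetical wrapper: takes no arguments and supplies the parameters itself.
cat > "$work_dir/database_pwd_jobrunner.sh" <<'EOF'
#!/bin/sh
# Call the original script in the same directory with fixed parameters.
exec sh "$(dirname "$0")/database_pwd.sh" -name jobrunner
EOF

chmod +x "$work_dir/database_pwd.sh" "$work_dir/database_pwd_jobrunner.sh"

# Run the wrapper without arguments, as the exec function would.
wrapper_output=$(sh "$work_dir/database_pwd_jobrunner.sh")
echo "wrapper returned: $wrapper_output"
```

If such a wrapper were saved in $DPM_CONF, the function argument would be only the wrapper file name, for example ${exec("database_pwd_jobrunner.sh")}.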

Using Hosted Help

By default, Control Hub uses the installed help project. You can configure Control Hub to use the help project hosted on the StreamSets website.

Hosted help contains the latest available documentation and requires an internet connection. Both help projects provide context-sensitive help.

The following lines in the dpm.properties file configure Control Hub to use the locally installed help project by default:
ui.doc.help.url=docs/
#ui.doc.help.url=https://streamsets.com/documentation/controlhub/latest/help/
To configure Control Hub to use the hosted help, comment out the first definition of the ui.doc.help.url property and uncomment the second definition. Then, in the URL of the second definition, replace latest with the version of Control Hub that you are using and change help to onpremhelp, as follows:
#ui.doc.help.url=docs/
ui.doc.help.url=https://streamsets.com/documentation/controlhub/<version>/onpremhelp/
For example, to use the help hosted on the website for Control Hub on-premises version 3.57.x, modify the lines as follows:
#ui.doc.help.url=docs/
ui.doc.help.url=https://streamsets.com/documentation/controlhub/3.57.x/onpremhelp/

Restart Control Hub for the changes to take effect.
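
The same edit can be scripted. The following is a sketch, assuming that the two default lines appear in dpm.properties exactly as shown above; it works on a scratch copy rather than the real file, and it uses the 3.57.x version from the example.

```shell
# Version from the example above; substitute the version that you run.
version='3.57.x'

# Scratch copy containing the two default lines from dpm.properties.
props=$(mktemp)
cat > "$props" <<'EOF'
ui.doc.help.url=docs/
#ui.doc.help.url=https://streamsets.com/documentation/controlhub/latest/help/
EOF

# First expression: comment out the local definition.
# Second expression: uncomment the hosted definition, set the version,
# and change help to onpremhelp.
sed -e 's|^ui\.doc\.help\.url=docs/$|#&|' \
    -e "s|^#\(ui\.doc\.help\.url=https://streamsets\.com/documentation/controlhub/\)latest/help/\$|\1${version}/onpremhelp/|" \
    "$props" > "$props.new"

cat "$props.new"
```

Review the generated file before copying it over the real dpm.properties, because the patterns match only if the default lines are unmodified.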

Customizing the StreamSets Logo

You can customize the StreamSets logo that displays in the top toolbar of Control Hub.

For example, you can replace the StreamSets logo with your company logo.

To customize the logo, simply overwrite the following file with your custom logo file:

$DPM_HOME/dpm-static-web/assets/images/logo.png

The change takes effect immediately - you do not need to restart Control Hub to see the customized logo.
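
As a sketch, the overwrite can be done with a plain copy. In the following example, a temporary directory stands in for $DPM_HOME, the file company_logo.png is hypothetical, the file contents are placeholder bytes rather than real images, and the backup copy is an optional precaution that is not required.

```shell
# Stand-in for $DPM_HOME so the sketch runs without a Control Hub install.
dpm_home=$(mktemp -d)
logo_dir="$dpm_home/dpm-static-web/assets/images"
mkdir -p "$logo_dir"
printf 'original-logo' > "$logo_dir/logo.png"   # placeholder, not a real PNG

# Hypothetical custom logo file.
custom_logo="$dpm_home/company_logo.png"
printf 'company-logo' > "$custom_logo"

# Keep a copy of the shipped logo, then overwrite it in place.
cp "$logo_dir/logo.png" "$logo_dir/logo.png.bak"
cp "$custom_logo" "$logo_dir/logo.png"

# Confirm that the installed logo now matches the custom file.
cmp -s "$custom_logo" "$logo_dir/logo.png" && echo "logo replaced"
```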

Always Migrate Job Offsets

By default, when you stop and then restart a job that is disabled for failover, Control Hub sends the last-saved offset to the same Data Collector that originally ran the pipeline. You can configure Control Hub to always send job offsets to different Data Collectors with matching labels when you restart a job.

The always.migrate.offsets property in the $DPM_CONF/jobrunner-app.properties file determines whether Control Hub always migrates job offsets to different Data Collectors when you stop and restart a job.

If set to true, Control Hub always sends the offset and pipeline instance to a different Data Collector when you restart the job. Set to true when all of your jobs include a pipeline origin that is not tied to a particular Data Collector machine. For example, if a pipeline reads from an external system such as a relational database or Elasticsearch, any Data Collector within the same network and with an identical configuration can continue processing from the last-saved offset recorded by another Data Collector.

If set to false, Control Hub determines the Data Collector to use on restart based on whether failover is enabled for the job:
  • Failover is disabled - Control Hub sends the offset to the same Data Collector that originally ran the pipeline instance. In other words, Control Hub associates each pipeline instance with the same Data Collector.
  • Failover is enabled - Control Hub sends the offset to another available Data Collector assigned all labels specified for the job.

Set to false when most of your jobs include a pipeline origin that is tied to a particular Data Collector machine, for example, a Directory or File Tail origin that reads from a local directory on the Data Collector machine.

Default is false.
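
The combinations described above can be summarized in a small decision function. This is only an illustration of the documented behavior, not Control Hub code; the function name offset_target and its two arguments are invented for the sketch.

```shell
# Illustration of the documented restart behavior; not Control Hub's code.
# $1: value of always.migrate.offsets (true or false)
# $2: whether failover is enabled for the job (true or false)
offset_target() {
  if [ "$1" = "true" ]; then
    echo "different Data Collector"
  elif [ "$2" = "true" ]; then
    echo "another available Data Collector with all job labels"
  else
    echo "same Data Collector"
  fi
}

# Print the outcome for every combination of the two settings.
for always in true false; do
  for failover in true false; do
    printf 'always.migrate.offsets=%s failover=%s -> %s\n' \
      "$always" "$failover" "$(offset_target "$always" "$failover")"
  done
done
```

With the default value of false, the outcome depends only on the failover setting of each job.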