Control Hub Configuration File
You can customize how a registered Data Collector works with StreamSets Control Hub by editing the Control Hub configuration file, $SDC_CONF/dpm.properties, located in the Data Collector installation.
Use a text editor to edit the dpm.properties configuration file. To enable the changes, restart Data Collector.
The Control Hub configuration file includes the following general properties:
General Property | Description |
---|---|
dpm.enabled | Specifies whether the Data Collector is enabled to work with Control Hub. Default is false. |
dpm.base.url | URL to access Control Hub. |
dpm.registration.retry.attempts | Maximum number of times that Data Collector attempts to register with Control Hub before failing the registration. Default is 5. |
dpm.security.validationTokenFrequency.secs | Frequency in seconds that Data Collector validates authentication and user tokens with Control Hub. Default is 60. |
dpm.appAuthToken | File located within $SDC_CONF, the Data Collector configuration directory, that includes the authentication token for this Data Collector instance. Generally, you should not need to change this value. |
dpm.remote.control.job.labels | Labels to assign to this Data Collector. Use labels to group Data Collectors registered with Control Hub. To assign multiple labels, enter a comma-separated list of labels. Default is "all", which you can use to run a job on all registered Data Collectors. |
dpm.remote.control.ping.frequency | Frequency in milliseconds that Data Collector notifies Control Hub that it is running. Default is 5,000. |
dpm.remote.control.events.recipient | Name of the internal Control Hub application to which Data Collector sends pipeline status updates. Do not change this value. |
dpm.remote.control.process.events.recipients | Names of the internal Control Hub applications to which Data Collector sends performance updates, including CPU load and memory usage. Do not change this value. |
dpm.remote.control.status.events.interval | Frequency in milliseconds that Data Collector sends status information to Control Hub. Default is 60,000. |
dpm.remote.deployment.id | For provisioned Data Collectors, the ID of the deployment that provisioned the Data Collector. For manually administered Data Collectors, the value is blank. Do not change this value. |
http.meta.redirect.to.sso | Enables the redirect of Data Collector user logins to Control Hub using the HTML meta refresh method. Set to true only if the registered Data Collector is installed as an application on Microsoft Azure HDInsight. Default is false, which means that Data Collector uses HTTP redirect headers to redirect logins. Use the default for all other Data Collector installation types. |
dpm.alias.name.enabled | Enables using an abbreviated Control Hub user ID when Hadoop impersonation mode or shell impersonation mode is used. By default, when using either impersonation mode, a Data Collector registered with Control Hub uses the full Control Hub user ID as the user name. Enable this property to use only the ID portion of the user ID. To use a partial Control Hub user ID, uncomment the property and set it to true. When using Hadoop impersonation mode, the Hadoop system, Data Collector, and the pipeline stages must be properly configured. For more information, see Hadoop Impersonation Mode. When using shell impersonation mode, Data Collector and the operating system that runs the shell script must be properly configured. For more information, see Data Collector Shell Impersonation Mode. |
dpm.runHistory.enabled | Enables storing information about previous pipeline runs in the data/runHistory folder in the engine installation directory. This property is used only in IBM StreamSets. |
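
As a rough illustration of how several of these properties might appear together in $SDC_CONF/dpm.properties, consider the minimal sketch below. The URL and the dev-cluster label are hypothetical placeholders, not values from a real deployment. The numeric values are the defaults listed in the table, written without thousands separators on the assumption that the file expects plain integers. dpm.enabled and dpm.alias.name.enabled are set to true only to show non-default settings.

```properties
# Sketch only: illustrative values, not a recommended configuration.

# Assumes this Data Collector is registered with Control Hub.
dpm.enabled=true

# Hypothetical Control Hub URL; replace with the URL of your Control Hub.
dpm.base.url=https://controlhub.example.com

# Defaults from the table above.
dpm.registration.retry.attempts=5
dpm.security.validationTokenFrequency.secs=60
dpm.remote.control.ping.frequency=5000
dpm.remote.control.status.events.interval=60000

# Comma-separated labels; "all" is the default, "dev-cluster" is a made-up example.
dpm.remote.control.job.labels=all,dev-cluster

# Keep false unless Data Collector runs as an application on Microsoft Azure HDInsight.
http.meta.redirect.to.sso=false

# Uncommented and set to true to use the abbreviated user ID with
# Hadoop or shell impersonation mode.
dpm.alias.name.enabled=true
```

After saving changes like these, restart Data Collector so that they take effect. Properties marked "Do not change this value" in the table are deliberately left out of the sketch.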