Creating Another Data Collector Instance

You can create another instance of a Data Collector tarball or RPM installation on the same machine with the create-dc command. The additional Data Collector instance uses the same configuration as the original Data Collector instance. You can modify the configuration properties as needed.

When you want to run another Data Collector instance with the same configuration, using the create-dc command is simpler than downloading and installing another Data Collector. If you install a new instance, you must then manually repeat the same modifications to the configuration files.

Note: The create-dc command copies the original Data Collector configuration directory inside the base runtime directory of the additional Data Collector instance. However, StreamSets recommends that you use directories outside of the runtime directory so that the directories remain usable after you upgrade Data Collector. For information on modifying Data Collector directories, see Data Collector Directories.

Data Collector does not need to be running to use the create-dc command. Call the command from the $SDC_DIST directory as follows:

bin/streamsets create-dc -home=<SDC_HOME_DIR> (-httpPort=<SDC_HTTP_PORT> | -httpsPort=<SDC_HTTPS_PORT>) \
[-conf=<SDC_CONF_DIR>]

Use the -help option to view additional information for the command, for example: create-dc -help.

create-dc options:

-home=<SDC_HOME_DIR>
  Required. Home directory for the additional Data Collector instance.

  The parent of the specified home directory must exist. However, the home directory itself must not exist, because the command creates that directory. For example, say you enter the following for the home directory option:
  -home="/sdcs/sdc2"

  The directory /sdcs must exist and must not contain a subdirectory named sdc2.

-httpPort=<SDC_HTTP_PORT> or -httpsPort=<SDC_HTTPS_PORT>
  Required. HTTP or HTTPS port for the additional Data Collector instance. Enter a port number that is not in use.

-conf=<SDC_CONF_DIR>
  Optional for tarball installations with a manual start. Required for tarball and RPM installations with a service start. The configuration directory of the original Data Collector instance.

  For a tarball installation with a manual start, you must use this option if you changed the default location of the Data Collector configuration directory, $SDC_DIST/etc. For example, you must use this option if you followed our recommendation to use a configuration directory outside of the base runtime directory, $SDC_DIST.

  For a tarball or RPM installation with a service start, you must use this option. The default location for the Data Collector configuration directory for a service start is /etc/sdc.

  Default is $SDC_DIST/etc.
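
The -home requirements described above can be checked before you run the command. The following is a minimal sketch, not part of Data Collector: check_sdc_home is a hypothetical helper name, and it only tests that the parent directory exists and that the home directory itself does not.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Data Collector).
# Verifies the two -home preconditions for create-dc:
#   1. the parent of the home directory exists
#   2. the home directory itself does not exist yet
check_sdc_home() {
    home_dir=$1
    parent_dir=$(dirname "$home_dir")

    if [ ! -d "$parent_dir" ]; then
        echo "Parent directory $parent_dir does not exist." >&2
        return 1
    fi

    if [ -e "$home_dir" ]; then
        echo "$home_dir already exists; create-dc must create it." >&2
        return 2
    fi

    return 0
}
```

For example, check_sdc_home /sdcs/sdc2 && bin/streamsets create-dc -home="/sdcs/sdc2" -httpPort=19001 would call create-dc only when both checks pass. The sketch does not check whether the port is in use.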

For example, the following command creates an additional Data Collector instance with a home directory of /sdcs/sdc2 and an HTTP port of 19001. The original Data Collector instance is an RPM installation, and so the command specifies the location of the original Data Collector configuration directory as /etc/sdc:
bin/streamsets create-dc -home="/sdcs/sdc2" -httpPort=19001 -conf="/etc/sdc"

The command copies the configuration directory of the original Data Collector instance inside the home directory of the additional Data Collector instance, so that the configuration directory for the additional Data Collector instance is: /sdcs/sdc2/etc/sdc.
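
If you create several instances, it can help to assemble the command line in one place. The following sketch assumes a hypothetical helper named build_create_dc_cmd that only prints the command for review and does not run it. It adds -conf only when a configuration directory is passed, which matches the manual-start tarball case where the default $SDC_DIST/etc is still in use. It covers the -httpPort form only.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Data Collector).
# Prints a create-dc command line built from:
#   $1  home directory for the additional instance
#   $2  HTTP port for the additional instance
#   $3  optional: configuration directory of the original instance
build_create_dc_cmd() {
    home_dir=$1
    http_port=$2
    conf_dir=${3:-}

    cmd="bin/streamsets create-dc -home=\"$home_dir\" -httpPort=$http_port"

    # -conf is required for a service start and whenever the
    # configuration directory was moved from its default location.
    if [ -n "$conf_dir" ]; then
        cmd="$cmd -conf=\"$conf_dir\""
    fi

    printf '%s\n' "$cmd"
}
```

Calling build_create_dc_cmd /sdcs/sdc2 19001 /etc/sdc prints the same command as the RPM example above. Run the printed command from the $SDC_DIST directory.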