Users with a StreamSets enterprise account can install the Data Collector tarball
and start it as a service on supported operating systems that use the systemd init system.
Supported operating systems include CentOS 7, Oracle Linux 7, Red Hat Enterprise Linux 7, and
Ubuntu 16.04 LTS.
For tarball installation instructions for operating systems that use the SysV init system,
see Installing from the Tarball for Systems Using SysV Init.
This procedure walks through setting the default directories and the default system user
and group used to start Data Collector as a service. Alternatively, before you install, you can
edit the $SDC_DIST/systemd/sdc.service file to modify the environment variables that define
the directories and the system user and group.
Note: StreamSets does not recommend using NFS or NAS to store Data Collector files.
Installing the full Data Collector as a service requires root privileges.
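
The two prerequisites above (root privileges and a systemd init system) can be checked before starting the procedure. The following is a minimal sketch, not part of the StreamSets distribution: the function names and their optional arguments are hypothetical, and the systemd check assumes the usual convention that the /run/systemd/system directory exists only when systemd is the running init system.

```shell
#!/bin/sh
# Pre-flight checks for the prerequisites stated above.
# All function names here are hypothetical helpers, not StreamSets tooling.

# Succeeds when the given numeric user ID (default: the current user) is 0 (root).
is_root() {
    [ "${1:-$(id -u)}" -eq 0 ]
}

# Succeeds when systemd is the running init system. systemd creates
# /run/systemd/system at boot; the optional path argument exists only
# so that the check can be exercised in tests.
uses_systemd() {
    [ -d "${1:-/run/systemd/system}" ]
}

# Runs both checks, reports each failure on stderr, and returns
# non-zero if any check failed.
preflight() {
    status=0
    is_root || { echo "error: run this installation as root" >&2; status=1; }
    uses_systemd || { echo "error: systemd is not the running init system" >&2; status=1; }
    return "$status"
}
```

To use the sketch, call `preflight || exit 1` at the top of an installation script so that the script stops before changing anything on an unsupported system.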
-
Download the Data Collector full tarball from the StreamSets Support portal.
-
Use the following command to extract the tarball to the desired location,
typically /opt/streamsets-datacollector/:
tar xf streamsets-datacollector-all-<version>.tgz -C <extraction directory>
For example, to extract version 5.4.0, use the following command:
tar xf streamsets-datacollector-all-5.4.0.tgz -C /opt/streamsets-datacollector
-
Use the following command from the directory where you extracted the tarball to
copy systemd/sdc.service to the
/etc/systemd/system directory:
cp systemd/sdc.service /etc/systemd/system/sdc.service
-
If you did not extract the tarball to the default directory,
/opt/streamsets-datacollector/, override the SDC_HOME and ExecStart values
defined in the /etc/systemd/system/sdc.service file.
Override the default values using the same procedure that you use to
override unit configuration files on any system that uses the systemd init system. For an example, see
"Example 2. Overriding vendor settings" in the systemd.unit man page.
-
Use the following command from the directory where you extracted the tarball to
copy systemd/sdc.socket to the
/etc/systemd/system directory:
cp systemd/sdc.socket /etc/systemd/system/sdc.socket
-
Optionally, edit the /etc/systemd/system/sdc.socket file
to modify the Data Collector port number. The port must match the one defined in
sdc.properties. The default is 18630.
-
Create a system user and group named sdc.
For example, use the following command to create a system user and group with
the next available group ID and user ID:
groupadd -r sdc && useradd -r -d <installation dir> -g sdc -s /sbin/nologin sdc
If you’re installing Data Collector on multiple machines, we recommend explicitly specifying a group ID and
user ID so that the IDs are consistent across the machines. Use the -g flag of groupadd to
specify the group ID and the -u flag of useradd to specify the user ID.
-
Use the following command to reload the systemd manager configuration:
systemctl daemon-reload
-
Use the following command to create the Data Collector configuration directory at /etc/sdc:
mkdir /etc/sdc
-
Use the following command from the directory where you extracted the tarball to
copy all files from etc into the Data Collector configuration directory that you just created:
cp -R etc/* /etc/sdc
-
Use the following command to change the owner of the
/etc/sdc directory and all files in the directory to
sdc:sdc:
chown -R sdc:sdc /etc/sdc
-
Use the following commands to create the Data Collector log directory at /var/log/sdc and change the owner to
sdc:sdc:
mkdir /var/log/sdc
chown sdc:sdc /var/log/sdc
-
Use the following commands to create the Data Collector data directory at /var/lib/sdc and change the owner to
sdc:sdc:
mkdir /var/lib/sdc
chown sdc:sdc /var/lib/sdc
-
Use the following commands to create the Data Collector resources directory at /var/lib/sdc-resources and change
the owner to sdc:sdc:
mkdir /var/lib/sdc-resources
chown sdc:sdc /var/lib/sdc-resources
-
Use the following command to start Data Collector as a service:
systemctl start sdc
-
To add the Data Collector service to the system startup, use the
following command:
systemctl enable sdc
-
To access the Data Collector UI, enter the following URL
in the address bar of your browser:
http://<system-ip>:18630/
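
For the step that covers a non-default extraction directory, the override can be written as a systemd drop-in file rather than by editing the copied unit. The following is a sketch under stated assumptions: the function name write_sdc_override and the example path /data/streamsets-datacollector are hypothetical, and the sketch assumes that the shipped sdc.service sets SDC_HOME through an Environment= line and starts Data Collector with bin/streamsets dc. Check the shipped unit file for the exact ExecStart command line and flags before relying on it.

```shell
#!/bin/sh
# write_sdc_override DROPIN_DIR SDC_HOME
# Writes DROPIN_DIR/override.conf so that systemd uses SDC_HOME as the
# Data Collector installation directory. Hypothetical helper, not StreamSets tooling.
write_sdc_override() {
    dropin_dir=$1
    sdc_home=$2
    mkdir -p "$dropin_dir"
    cat > "$dropin_dir/override.conf" <<EOF
[Service]
# ASSUMPTION: the shipped unit defines SDC_HOME with an Environment= line.
Environment=SDC_HOME=$sdc_home
# An empty ExecStart= clears the command inherited from the shipped unit.
ExecStart=
# ASSUMPTION: copy the real command and flags from the shipped sdc.service,
# changing only the installation path.
ExecStart=$sdc_home/bin/streamsets dc
EOF
}
```

For example, running `write_sdc_override /etc/systemd/system/sdc.service.d /data/streamsets-datacollector` as root and then `systemctl daemon-reload` would apply the override. A drop-in keeps the copied sdc.service identical to the shipped file, which makes later upgrades easier to compare.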