Installing when Spark Runs Locally
To get started with Transformer in a development environment, install both Transformer and Spark on the same machine. This allows you to easily develop and test local pipelines, which run on the local Spark installation.
All users can install Transformer from a tarball and run it manually. Users with an enterprise account can install Transformer from an RPM package and run it as a service. Installing an RPM package requires root privileges.
When installed from the RPM package, Transformer uses a system user and group named transformer. If a transformer user and a transformer group do not exist on the machine, the installation creates the user and group for you and assigns them the next available user ID and group ID.
To use specific IDs for the transformer user and group, create the user and group before installation and specify the IDs that you want to use. For example, if you're installing Transformer on multiple machines, you might want to create the system user and group before installation to ensure that the user ID and group ID are consistent across the machines.
Before you start, ensure that the machine meets all installation requirements and choose the installation package that you want to use.
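As an illustration of creating the transformer user and group ahead of time, the following sketch prints the groupadd and useradd commands for a chosen pair of IDs. It is a minimal example, not the documented procedure: the function name transformer_account_cmds and the ID 1500 are hypothetical, and the commands are printed rather than executed so that the script is safe to run without root privileges.

```shell
#!/bin/sh
# Print the commands that pre-create the transformer group and user with
# fixed IDs. Printing instead of executing keeps the sketch safe to run
# without root; review the output, then run it as root on each machine.
transformer_account_cmds() {
  uid="$1"
  gid="$2"
  # Create the group first so that the user can be assigned to it.
  echo "groupadd -g ${gid} transformer"
  # -r marks the account as a system account.
  echo "useradd -r -u ${uid} -g transformer transformer"
}

# 1500 is a hypothetical ID; choose one that is unused on every machine.
transformer_account_cmds 1500 1500
```

After running the printed commands as root, `id transformer` should report the same user ID and group ID on every machine.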
- Download the Transformer installation package from one of the following locations:
  - StreamSets Support portal if you have an enterprise account.
  - StreamSets website if you do not have an enterprise account.
  If using the RPM package, download the appropriate package for your operating system:
  - For CentOS 6, Oracle Linux 6, or Red Hat Enterprise Linux 6, download the RPM EL6 package.
  - For CentOS 7, Oracle Linux 7, or Red Hat Enterprise Linux 7, download the RPM EL7 package.
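If you are unsure which RPM package matches a machine, a sketch like the following maps the operating system's major version to the EL6 or EL7 package. It assumes that /etc/redhat-release is present, which is typical of the distributions listed above. The function name rpm_package_for_release is hypothetical.

```shell
#!/bin/sh
# Map the contents of /etc/redhat-release to the RPM package to download.
rpm_package_for_release() {
  # Extract the major version number that follows the word "release".
  major=$(printf '%s\n' "$1" | sed -n 's/.*release \([0-9][0-9]*\).*/\1/p')
  case "$major" in
    6) echo "EL6" ;;
    7) echo "EL7" ;;
    *) echo "unlisted" ;;   # not one of the versions named above
  esac
}

# Fall back to an empty string on machines without the file.
release=$(cat /etc/redhat-release 2>/dev/null || true)
rpm_package_for_release "$release"
```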
- If you downloaded the tarball, use the following command to extract the tarball to the desired location:
  tar xf streamsets-transformer-all_<scala version>-<transformer version>.tgz -C <extraction directory>
  For example, to extract Transformer version 4.1.0 prebuilt with Scala 2.11.x, use the following command:
  tar xf streamsets-transformer-all_2.11-4.1.0.tgz -C /opt/streamsets-transformer/
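Note that tar expects the directory passed to -C to exist already. The following sketch builds the tarball name from the two versions and creates the extraction directory before extracting. The function names are hypothetical, and the versions and path are the example values used above.

```shell
#!/bin/sh
# Build the tarball file name from the Scala and Transformer versions.
transformer_tarball() {
  echo "streamsets-transformer-all_$1-$2.tgz"
}

# Create the target directory if needed, then extract the tarball into it.
extract_transformer() {
  tarball="$1"
  target="$2"
  mkdir -p "$target" && tar xf "$tarball" -C "$target"
}

tarball=$(transformer_tarball 2.11 4.1.0)
# Extract only when the download is present in the current directory.
if [ -f "$tarball" ]; then
  extract_transformer "$tarball" /opt/streamsets-transformer/
else
  echo "not found: $tarball"
fi
```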
- If you downloaded the RPM package, complete the following steps to extract and install the package:
- Download Apache Spark from the Apache Spark Download page to the same machine as the Transformer installation.
  Download a supported Spark version that is valid for the Transformer features that you want to use.
- Extract the downloaded Spark file.
- Add the SPARK_HOME environment variable to the Transformer environment configuration file to define the Spark installation path on the Transformer machine.
  Modify environment variables using the method required by your installation type.
  Set the environment variable as follows:
  export SPARK_HOME=<Spark path>
  For example:
  export SPARK_HOME=/opt/spark-2.4.0-bin-hadoop2.7/
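Before starting Transformer, it can help to confirm that SPARK_HOME points at an extracted Spark distribution. The following sketch checks for the bin/spark-submit launcher that Spark distributions include. The function name is_spark_home is hypothetical, and the default path is only the example value from above.

```shell
#!/bin/sh
# Succeed when the given directory looks like a Spark installation, that is,
# when it contains an executable bin/spark-submit launcher.
is_spark_home() {
  [ -n "$1" ] && [ -x "$1/bin/spark-submit" ]
}

# Example path from the text; replace it with your Spark path.
SPARK_HOME="${SPARK_HOME:-/opt/spark-2.4.0-bin-hadoop2.7/}"

if is_spark_home "$SPARK_HOME"; then
  echo "SPARK_HOME looks valid: $SPARK_HOME"
else
  echo "No bin/spark-submit under: $SPARK_HOME"
fi
```

If the check passes, running "$SPARK_HOME/bin/spark-submit" --version prints the Spark version, which you can compare against the versions that Transformer supports.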