Launching a Transformer Tarball

To use a Transformer tarball when Spark runs locally, complete the following steps on the machine where you want to install and launch Transformer.

  1. Verify that the machine meets all the requirements for a Transformer engine.
  2. Download and install the JDK version required by the Scala version selected for the engine version.
  3. Open a command prompt and set your file descriptor limit to at least 32768.
  4. Download Apache Spark from the Apache Spark Download page.
    Download a supported Spark version that is valid for the Transformer features that you want to use.
    Make sure that the Spark version is prebuilt with the same Scala version as Transformer. For more information, see Scala Match Requirement.
  5. Open a command prompt and then extract the Apache Spark tarball by running the following command:
    tar xvzf <spark tarball name>

    For example:

    tar xvzf spark-3.5.3-bin-hadoop3.tgz
  6. Run the following command to set the SPARK_HOME environment variable to the directory where you extracted the Apache Spark tarball:
    export SPARK_HOME=<spark path>

    For example:

    export SPARK_HOME=/opt/spark-3.5.3-bin-hadoop3/
  7. Paste and then run the installation script command that you copied from the self-managed deployment. Respond to the command prompts to enter download and installation directories.
    Note: If needed, you can retrieve the generated installation script. You can optionally skip the command prompts by defining the directories as command arguments.
  8. If you chose to check the engine status, view it in the command prompt or in the Control Hub UI.
  9. To deploy an additional engine instance for this deployment, repeat these steps on another machine.
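
Steps 3 through 6 can be sketched as a single shell session. This is a minimal sketch, not the installation script itself: it reuses the tarball name from the examples above (spark-3.5.3-bin-hadoop3.tgz), and it creates a placeholder tarball under /tmp/spark-demo so the commands run end to end. In practice you download the real tarball from the Apache Spark Download page and extract it to your chosen Spark path.

```shell
# Placeholder only: fabricate an empty Spark tarball so this sketch is
# self-contained. In a real install, skip this and use the downloaded file.
mkdir -p /tmp/spark-demo/spark-3.5.3-bin-hadoop3
tar -C /tmp/spark-demo -czf /tmp/spark-demo/spark-3.5.3-bin-hadoop3.tgz \
    spark-3.5.3-bin-hadoop3

# Step 3: raise the soft file descriptor limit for this session.
# Some environments cap this below 32768; the fallback just reports the limit.
ulimit -n 32768 2>/dev/null || ulimit -n

# Step 5: extract the Apache Spark tarball.
cd /tmp/spark-demo && tar xvzf spark-3.5.3-bin-hadoop3.tgz

# Step 6: point SPARK_HOME at the extracted directory before running
# the installation script copied from the self-managed deployment.
export SPARK_HOME=/tmp/spark-demo/spark-3.5.3-bin-hadoop3/
echo "$SPARK_HOME"
```

Because `export` only affects the current shell, run the installation script from this same session (or persist SPARK_HOME in your shell profile) so the engine picks up the Spark location.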