Monitoring Transformers

When you view registered Transformers in the Execute view, you can monitor the performance of each Transformer and the pipelines currently running on each Transformer.

You can view configuration properties, all active Java threads, metric charts, logs, and directories for each Transformer. You can also generate a support bundle with the information required to troubleshoot various issues with the engine.

To monitor a Transformer, simply expand the Transformer details in the Execute > Transformers view.

To display most monitoring details, the web browser must be able to access the execution Transformer.

Tip: You can also monitor Transformers by creating a subscription that automatically notifies you when a registered Transformer stops responding.

Performance

When you view the details of a Transformer in the Execute view, you can monitor the performance of the Transformer.

Control Hub displays the following performance information for Transformers:
CPU Load
Percentage of CPU being used by the Transformer.
Memory Used
Amount of memory being used by the Transformer out of the total amount of memory allocated to that Transformer.
For example, let's say that a Transformer displays the following value for Memory Used:
216.36 MB of 1038.88 MB
That means that the Transformer is using 216.36 MB of the 1038.88 MB Java heap allocated to it. You configure the Transformer Java heap size in the TRANSFORMER_JAVA_OPTS environment variable.
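To compare Transformers by relative memory pressure, the Memory Used value can be converted into a percentage. The following Python sketch parses a value in the format shown above. The function name is hypothetical, and the code assumes that both numbers are reported in the same unit.

```python
import re


def memory_used_percent(memory_used: str) -> float:
    """Return used memory as a percentage of the allocated heap.

    Assumes the 'X MB of Y MB' format from the example above and that
    both values use the same unit.
    """
    match = re.fullmatch(
        r"\s*([\d.]+)\s*(\w+)\s+of\s+([\d.]+)\s*(\w+)\s*", memory_used
    )
    if match is None:
        raise ValueError(f"unrecognized Memory Used value: {memory_used!r}")
    used, used_unit, total, total_unit = match.groups()
    if used_unit != total_unit:
        raise ValueError("used and total values use different units")
    return float(used) / float(total) * 100


print(f"{memory_used_percent('216.36 MB of 1038.88 MB'):.1f}%")  # prints 20.8%
```

In this example, the Transformer is using about a fifth of its allocated heap.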

You can sort the list of Transformers by the CPU load or by the memory usage so that you can easily determine which Transformers are using the most resources.

You can also analyze historical time series charts for the CPU load and memory usage. For example, you can view the performance information for the last hour or for the last seven days. The following image displays the location where you select a time period for analysis of the charts:

[Image: time period selection for the performance charts]

By default, registered Transformers send the CPU load and memory usage to Control Hub every minute. You can change the frequency with which each Transformer sends this information to Control Hub by modifying the dpm.remote.control.status.events.interval property in the Control Hub configuration file, $TRANSFORMER_CONF/dpm.properties.
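Before changing the reporting interval, it can help to check the value that is currently configured. The following Python sketch reads a single property from Java-style properties text. The helper name and the sample file content are hypothetical, and the sample value of 60000 assumes that the interval is expressed in milliseconds, which is not stated here.

```python
from typing import Optional


def read_property(properties_text: str, key: str) -> Optional[str]:
    """Return the value of `key` from Java-style properties text, or None.

    Handles only simple `key=value` lines. Comment lines (# or !) and
    blank lines are skipped; line continuations are not handled.
    """
    for line in properties_text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        name, separator, value = line.partition("=")
        if separator and name.strip() == key:
            return value.strip()
    return None


# Hypothetical file content; the value and its unit are assumptions.
sample = """
# Control Hub configuration (illustrative)
dpm.remote.control.status.events.interval=60000
"""
print(read_property(sample, "dpm.remote.control.status.events.interval"))
```

To check a real installation, pass in the text of the dpm.properties file from the directory that $TRANSFORMER_CONF points to.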

Configuration Properties

When you view the details of a Transformer in the Execute view, click View Engine Configuration to view the Transformer configuration properties.

The Configuration tab displays a read-only view of the properties in all Transformer configuration files. Transformer configuration files include the $TRANSFORMER_CONF/transformer.properties file and the following additional files:
  • dpm.properties
  • vault.properties
  • credential-stores.properties

You can enter text in the Filter field to filter the properties. For example, you can enter dpm to display the properties included in the dpm.properties file.

To edit the properties, edit the configuration files.

Metrics

When you view the details of a Transformer in the Execute view, click View Engine Metrics to view metric charts for the Transformer.

Metric charts include CPU usage, threads, and heap memory usage.

Thread Dump

When you view the details of a Transformer in the Execute view, click View Thread Dump to view all active Java threads used by the Transformer.

You can sort the list of threads by each column and refresh the list of threads. You can also enter text in the Filter field to filter the results. For example, you can filter the results by thread name or status.

When you expand a thread, Control Hub displays the stack trace for that thread.

Support Bundles

When you view the details of a Transformer in the Execute view, click View Support Bundle to generate a support bundle.

A support bundle is a ZIP file that includes Transformer logs, environment and configuration information, pipeline JSON files, resource files, and other details to help troubleshoot issues. You upload the generated file to a StreamSets Support ticket, and the Support team can use the information to help resolve your tickets. Alternatively, you can send the file to another StreamSets community member.
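Because a support bundle is a standard ZIP file, you can inspect its contents before you upload or share it. The following Python sketch lists the entries in an archive using the standard zipfile module. The function name and the entry names in the stand-in archive are made up for illustration and do not reflect the actual bundle layout.

```python
import io
import zipfile


def list_bundle_contents(bundle) -> list:
    """Return the names of all entries in a support bundle ZIP file.

    `bundle` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(bundle) as archive:
        return archive.namelist()


# Build a tiny stand-in archive in memory; the entry names are made up.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("example/transformer.log", "sample log line\n")
    archive.writestr("example/pipeline.json", "{}")
buffer.seek(0)

for name in list_bundle_contents(buffer):
    print(name)
```

For a downloaded bundle, pass the path of the ZIP file instead of the in-memory buffer.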

Support bundles work the same for Data Collector and Transformer. For more information about generating a support bundle, including the generators used to create the bundle and how to customize the generators, see Support Bundles in the Data Collector documentation.

Viewing Logs

You can view and download the Transformer log, $TRANSFORMER_LOG/transformer.log, when you monitor a Transformer. The Transformer log provides information about the Transformer engine, such as start-up messages, user logins, or pipeline display in the canvas.

Tip: When monitoring a job, you can also view the Spark driver log, which provides information about how Spark runs, previews, and validates pipelines.

For information about the log format and how to modify the log level, see Log Format in the Transformer documentation.

  1. View the details of a Transformer in the Execute view, click View Engine Configuration, and then click the Logs tab to view log data.
    The Logs tab displays roughly 50,000 characters of the most recent log information.
  2. To filter the messages by log level, select a level from the Severity list.
    By default, the log displays messages for all severity levels.
  3. To view earlier messages, click Load Previous Logs.
  4. To download the latest log file, click Download. To download a specific log file, click Download > <file name>.
    The most recent information is in the file with the highest number.
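If you have direct access to the Transformer machine, you can approximate what the Logs tab shows by reading the end of the log file. The following Python sketch returns roughly the last 50,000 characters of a text file. The function name is hypothetical, and the demonstration uses a temporary file rather than a real transformer.log.

```python
import os
import tempfile


def tail_characters(path: str, limit: int = 50_000) -> str:
    """Return roughly the last `limit` characters of a text file.

    Reads the last `limit` bytes and decodes them, so multi-byte
    characters make the result slightly shorter than `limit`.
    """
    with open(path, "rb") as log_file:
        log_file.seek(0, os.SEEK_END)
        size = log_file.tell()
        log_file.seek(max(0, size - limit))
        data = log_file.read()
    return data.decode("utf-8", errors="replace")


# Demonstrate on a temporary file rather than a real transformer.log.
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
    tmp.write("x" * 60_000 + "END")
print(len(tail_characters(tmp.name)))  # prints 50000
os.remove(tmp.name)
```

On a Transformer machine, the path would be the transformer.log file in the directory that $TRANSFORMER_LOG points to.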

Directories

You can view the directories that each Transformer uses. You might check which directories are in use so that you can access a file in a directory or increase the amount of available space for a directory.

When you view the details of a Transformer in the Execute view, click View Engine Configuration, and then click the Directories tab to view the directories.

Transformer directories are defined in environment variables. For more information, see Customization with Environment Variables in the Transformer documentation.

The following table describes the Transformer directories that display:

| Directory | Includes | Environment Variable |
|---|---|---|
| Runtime | Base directory for Transformer executables and related files. | TRANSFORMER_DIST |
| Configuration | The Transformer configuration file, transformer.properties, and related realm properties files and keystore files. Also includes the Log4j properties file. | TRANSFORMER_CONF |
| Data | Pipeline configuration and run details. | TRANSFORMER_DATA |
| Log | Transformer log file, transformer.log. | TRANSFORMER_LOG |
| Resources | Directory for runtime resource files. | TRANSFORMER_RESOURCES |
| DT Libraries Extra Directory | Directory to store external libraries. | STREAMSETS_LIBRARIES_EXTRA_DIR |
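When you work directly on a Transformer machine, you can check how these environment variables resolve. The following Python sketch looks up each variable named in the table above. The short labels and the function name are hypothetical, and the variables are set only in an environment configured for Transformer, so they might be unset in an ordinary shell.

```python
import os

# Environment variables named in the table above, keyed by a short label.
DIRECTORY_VARIABLES = {
    "Runtime": "TRANSFORMER_DIST",
    "Configuration": "TRANSFORMER_CONF",
    "Data": "TRANSFORMER_DATA",
    "Log": "TRANSFORMER_LOG",
    "Resources": "TRANSFORMER_RESOURCES",
    "Libraries Extra": "STREAMSETS_LIBRARIES_EXTRA_DIR",
}


def resolve_directories(environ=os.environ) -> dict:
    """Map each directory label to its configured path, or None if unset."""
    return {
        label: environ.get(variable)
        for label, variable in DIRECTORY_VARIABLES.items()
    }


for label, path in resolve_directories().items():
    print(f"{label}: {path or '(not set in this environment)'}")
```

Passing a plain dictionary instead of os.environ makes the lookup easy to test without changing the real environment.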

Pipeline Status

When you view the details of a Transformer in the Execute view, Control Hub displays the list of pipelines currently running on this Transformer.

Control Hub can display the following types of running pipelines for each Transformer:

Local pipelines
A local pipeline is either a test run of a draft pipeline or a pipeline that is managed by a Transformer and run locally on that Transformer.
Control Hub controlled pipelines
A Control Hub controlled pipeline is a pipeline that is managed by Control Hub and run remotely on a registered Transformer. Control Hub controlled pipelines are published pipelines run from Control Hub jobs.
After you publish or import pipelines to Control Hub, you add them to a job, and then start the job. When you start a job on a Transformer, Control Hub remotely runs an instance of the published pipeline on the Transformer. Use Control Hub to start, stop, and monitor published pipelines that are run from jobs.