Accessing Log File Information

Transformer provides access to the following log files:
Transformer log
The Transformer log, $TRANSFORMER_LOG/transformer.log, provides information about the Transformer application, such as start-up messages, user logins, or pipeline display in the canvas. You can open the log file on the Transformer machine. You can also view and download log data with the Control Hub UI. When needed, you can also modify the level of detail included in the Transformer log.
The Transformer log can include information about local pipelines or cluster pipelines run on Hadoop YARN in client deployment mode. For these types of pipelines, the Spark driver program is launched on the local Transformer machine. As a result, some pipeline processing messages are included in the Transformer log.
Spark driver log
A Spark driver log provides information about how Spark runs, previews, and validates pipelines.
By default, messages in the Spark driver log are logged at the ERROR severity level. To modify the log level, change the Log Level property on the Cluster tab for the pipeline.
You can view and download the Spark driver log for the following types of pipelines:
  • Local pipelines
  • Cluster pipelines run in Spark standalone mode
  • Cluster pipelines run on Amazon EMR
  • Cluster pipelines run on Hadoop YARN in client deployment mode
For local pipelines or cluster pipelines run on Hadoop YARN in client deployment mode, you can also open the Spark driver log file written to the following location on the Transformer machine for each pipeline: $TRANSFORMER_DATA/runInfo/<pipelineID>/run<timestamp>/driver-all.log
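The path layout above can be turned into a small lookup script. The following is a minimal sketch, not part of Transformer: the helper name latest_driver_log and the pipeline ID examplePipelineId are hypothetical, and the sketch assumes that the most recently modified run<timestamp> directory corresponds to the latest run.

```python
import os
from pathlib import Path
from typing import Optional


def latest_driver_log(data_dir: str, pipeline_id: str) -> Optional[Path]:
    """Return driver-all.log from the most recently modified run, or None."""
    run_root = Path(data_dir) / "runInfo" / pipeline_id
    # Each run writes its driver log under its own run<timestamp> directory.
    candidates = [p / "driver-all.log" for p in run_root.glob("run*") if p.is_dir()]
    existing = [p for p in candidates if p.is_file()]
    if not existing:
        return None
    # Assumption: file modification time reflects run order.
    return max(existing, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    data_dir = os.environ.get("TRANSFORMER_DATA", "")
    # "examplePipelineId" is a placeholder; substitute a real pipeline ID.
    log = latest_driver_log(data_dir, "examplePipelineId")
    print(log if log else "No driver log found")
```

The function returns None rather than raising an error when no run directory exists, which is the expected case for cluster pipelines whose driver runs remotely.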

For all other cluster pipelines, the Spark driver program is launched remotely on one of the worker nodes inside the cluster. To view the Spark driver logs for these pipelines, access the Spark web UI for the application launched for the pipeline.

When available, links to the Spark driver log and Spark web UI appear in the Runtime Statistics information that displays when you monitor a draft run of the pipeline.

Transformer Log Format

Transformer uses the Apache Log4j library to write log data. Each log entry includes a timestamp and message along with additional information relevant for the message.

In Control Hub, log entries include the following information:

  • Timestamp
  • Pipeline
  • Severity
  • Message
  • Category
  • User
  • Runner
  • Thread
In the downloaded log file, each log entry contains the same information in a different order. For the following pipeline types, the entry also includes the stage that generated the message:
  • Local pipelines
  • Cluster pipelines run on Hadoop YARN in client deployment mode
The downloaded log file does not include stage information for other types of cluster pipelines.

The information included in the downloaded file is defined by the appender.streamsets.layout.pattern property in the log configuration file, $TRANSFORMER_CONF/transformer-log4j2.properties.

To customize the log format, see the Log4j documentation. Transformer provides the following custom objects:
  • %X{s-entity} - Local pipeline name and ID
  • %X{s-runner} - Runner ID
  • %X{s-stage} - Stage name
  • %X{s-user} - User who initiated the operation
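To check which of these custom objects a given layout pattern uses, you can parse the property out of the configuration text. The following is a minimal sketch: the helper names are hypothetical, the properties parser handles only simple key=value lines, and the sample pattern is invented for illustration, so the default pattern shipped with Transformer may differ.

```python
import re
from typing import Dict, List

PATTERN_KEY = "appender.streamsets.layout.pattern"


def read_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blank lines and # or ! comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def custom_objects(pattern: str) -> List[str]:
    """Return the %X{...} keys referenced in a layout pattern, in order."""
    return re.findall(r"%X\{([^}]+)\}", pattern)


# Hypothetical pattern for illustration only.
sample = (
    "appender.streamsets.layout.pattern="
    "%d{ISO8601} [user:%X{s-user}] [stage:%X{s-stage}] %-5p %c{1} - %m%n"
)
props = read_properties(sample)
print(custom_objects(props[PATTERN_KEY]))
```

In practice, you would pass the contents of $TRANSFORMER_CONF/transformer-log4j2.properties to read_properties instead of the sample string.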

Viewing the Transformer Log

You can view the Transformer log, $TRANSFORMER_LOG/transformer.log, on the Transformer machine.

You can also view and download Transformer log data from Control Hub. When you download log data, you can select the file to download.

  1. In Control Hub, click Dashboards > Topologies Dashboard > Transformers.
  2. Click the Expand icon for the Transformer whose logs you want to view.
  3. In the engine details, click View Engine Configuration, then click the Logs tab.
  4. To view earlier events, click Load Previous Logs.
  5. To download the latest log file, click Download. To download a specific log file, click Download > <file name>.
    The most recent information is in the file with the highest number.
  6. To modify the log level, click Log Config.
    For more information, see Modifying the Transformer Log Level.
  7. To refresh the page for the latest log entries, click Refresh.
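When you work with the log file directly on the Transformer machine, a short script can summarize it by severity. The following is a minimal sketch under a stated assumption: it treats the first severity keyword that appears on a line as that entry's level, so a message that mentions a level name before the severity field would be miscounted. The function name count_by_severity is hypothetical.

```python
import os
import re
from collections import Counter
from typing import Iterable

LEVELS = ("TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL")
LEVEL_RE = re.compile(r"\b(" + "|".join(LEVELS) + r")\b")


def count_by_severity(lines: Iterable[str]) -> Counter:
    """Count lines by the first severity keyword found on each line."""
    counts = Counter()
    for line in lines:
        match = LEVEL_RE.search(line)
        # Lines without a level, such as stack trace lines, are skipped.
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    log_dir = os.environ.get("TRANSFORMER_LOG", ".")
    log_path = os.path.join(log_dir, "transformer.log")
    if os.path.isfile(log_path):
        with open(log_path, encoding="utf-8", errors="replace") as handle:
            print(count_by_severity(handle))
```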

Modifying the Transformer Log Level

If the Transformer log does not provide enough troubleshooting information, you can modify the log level to display messages at another severity level.

By default, Transformer logs messages at the INFO severity level. You can specify the following log levels:
  • TRACE
  • DEBUG
  • INFO (Default)
  • WARN
  • ERROR
  • FATAL
You can modify the log level through the Control Hub UI or by editing the log configuration file, $TRANSFORMER_CONF/transformer-log4j2.properties.
  1. To modify the log level through Control Hub, click Dashboards > Topologies Dashboard > Transformers.
  2. Click the Expand icon for the Transformer whose log level you want to modify.
  3. In the engine details, click View Engine Configuration, then click the Logs tab.
  4. Click Log Config.
    The contents of the log configuration file, $TRANSFORMER_CONF/transformer-log4j2.properties, display.
  5. Change the default value of INFO for the following line in the file:
    logger.l1.level=INFO

    For example, to set the log level to DEBUG, modify the line as follows:

    logger.l1.level=DEBUG
  6. Click Save.
    The changes that you make to the log level take effect immediately; you do not need to restart Transformer.

When you’ve finished troubleshooting, set the log level back to INFO to avoid having verbose log files.
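If you edit the log configuration file directly instead of using Control Hub, the same one-line change can be scripted. The following is a minimal sketch: the function name set_log_level is hypothetical, and it only rewrites the logger.l1.level line in the text that you pass in. This section does not state whether an edit made directly to the file takes effect without a restart, so treat that as something to verify in your environment.

```python
import re

VALID_LEVELS = ("TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL")


def set_log_level(config_text: str, level: str) -> str:
    """Return config_text with the logger.l1.level line set to the given level."""
    level = level.upper()
    if level not in VALID_LEVELS:
        raise ValueError(f"Unsupported log level: {level}")
    # Match only spaces and tabs around the value so that neighboring
    # lines and blank lines are left untouched.
    updated, count = re.subn(
        r"^([ \t]*logger\.l1\.level[ \t]*=[ \t]*)\S+[ \t]*$",
        r"\g<1>" + level,
        config_text,
        flags=re.MULTILINE,
    )
    if count == 0:
        raise ValueError("logger.l1.level not found in configuration")
    return updated


# In-memory example; in practice, read and write
# $TRANSFORMER_CONF/transformer-log4j2.properties.
original = "logger.l1.level=INFO\n"
debug_config = set_log_level(original, "DEBUG")   # while troubleshooting
restored = set_log_level(debug_config, "INFO")    # when finished
print(debug_config, restored, sep="")
```

Validating the level against the supported list before writing avoids saving a value that the logging configuration would not recognize.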