Hadoop Impersonation Mode

When the Hadoop YARN cluster is configured for impersonation but not for Kerberos authentication, you can configure the Hadoop impersonation mode that Transformer uses when performing tasks in the Hadoop system.  

When not using Kerberos, Transformer impersonates Hadoop users as follows:
  • As the user defined in the pipeline properties - When configured, Transformer uses the specified Hadoop user to launch the Spark application and to access files in the Hadoop system.
  • As the currently logged-in Transformer user who starts the pipeline - When no Hadoop user is defined in the pipeline properties, Transformer uses the user who starts the pipeline.
Important: When Kerberos authentication is enabled, Transformer impersonates Hadoop users as the Transformer user who starts the pipeline, or runs directly as the Kerberos principal defined for the pipeline. When Kerberos is enabled, Transformer ignores the Hadoop user defined in the pipeline properties.

The system administrator can configure Transformer to always use the user who starts the pipeline by enabling the hadoop.always.impersonate.current.user property in the Transformer configuration file. When this property is enabled, you cannot configure a Hadoop user within a pipeline.
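
The user-selection rules for the non-Kerberos case can be summarized in a short sketch. This is only an illustration of the behavior described above, not Transformer code: the function name resolve_hadoop_user and its parameters are hypothetical, and raising an error when a pipeline-level user is set while the property is enabled is an assumption about how "not allowed" might be modeled.

```python
from typing import Optional


def resolve_hadoop_user(
    current_user: str,
    pipeline_hadoop_user: Optional[str] = None,
    always_impersonate_current_user: bool = False,
) -> str:
    """Return the Hadoop user to impersonate when Kerberos is not enabled.

    Hypothetical helper that mirrors the documented rules; it is not part
    of Transformer.
    """
    if always_impersonate_current_user:
        if pipeline_hadoop_user:
            # Assumption: the disallowed configuration is modeled as an error.
            raise ValueError(
                "A pipeline-level Hadoop user is not allowed when "
                "hadoop.always.impersonate.current.user is enabled"
            )
        return current_user

    # A Hadoop user defined in the pipeline properties takes precedence.
    if pipeline_hadoop_user:
        return pipeline_hadoop_user

    # Otherwise, fall back to the user who starts the pipeline.
    return current_user
```

With these assumptions, resolve_hadoop_user("alice", "etl_user") returns the pipeline-level user, while resolve_hadoop_user("alice") and resolve_hadoop_user("alice", None, True) both return the user who starts the pipeline.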

Configure Transformer to always impersonate the user who starts the pipeline when you want to prevent a Hadoop user defined in the pipeline properties from being used to access data in Hadoop systems.

For example, say you use roles, groups, and pipeline permissions to ensure that only authorized operators can start pipelines. You expect that the operator user accounts are used to access all external systems. But a pipeline developer can specify a Hadoop user in the pipeline properties and bypass these controls. To close this loophole, configure Transformer to always use the user who starts the pipeline to read from or write to Hadoop systems.

To always use the user who starts the pipeline, uncomment the hadoop.always.impersonate.current.user property in the Transformer configuration file and set it to true.