Credential Stores

You can configure Transformer to access sensitive information that is secured in a credential store.

Transformer pipelines communicate with external systems to perform tasks such as launching a Spark application, or reading and writing data. Most of these external systems require sensitive information, such as user names or passwords, to access the system. When you configure pipeline stages for these external systems, you must specify the details that the stages need to connect to the system.

If you enter sensitive information directly in stage and pipeline properties, you expose those details to any user with access to the pipeline. To access external systems without exposing sensitive details, add them as secrets to a credential store and then use StreamSets credential functions in stage and pipeline properties to retrieve those values.

Defining secrets in a credential store can make it easier to migrate pipelines to another environment. For example, if you migrate multiple pipelines from a development to a production environment, you do not need to edit each pipeline with details for the production environment. You can simply replace the development credential store with the production version.

You can configure Transformer to use multiple credential stores. Each credential store is identified by a unique credential store ID.

You can use the following credential stores with Transformer:

Enabling Credential Stores

You can configure Transformer to use one or more credential stores. Each credential store is identified by a unique credential store ID.

You specify the credential stores that Transformer can use in the $TRANSFORMER_CONF/credential-stores.properties file. The file includes the following information:
credentialStores property
This property defines the credential stores that Transformer can use.
By default, the property is commented out and includes a default credential store ID for each of the supported credential store types, such as aws for AWS Secrets Manager and azure for Azure Key Vault.
To enable using credential stores, you uncomment this property and enter a comma-separated list of the credential store IDs to use.
You can specify multiple credential stores of the same type or of different types, such as two AWS Secrets Manager credential stores and one Java keystore. You simply specify a unique ID for each credential store.
usePortableGroups property
This property allows you to migrate pipelines that access a credential store from one Control Hub organization to another without updating the pipeline.
Important: Use this property only when recommended by the StreamSets Support team.
To call secrets from a pipeline, you use credential functions and include a group argument in the expression. When working with Control Hub, you define the group argument as follows: <group ID>@<organization ID>. When the usePortableGroups property is enabled, Transformer does not evaluate the <organization ID> portion of the group argument. This allows you to migrate pipelines from one organization to another without editing credential functions in pipelines, as long as the new organization has matching group names with the same credential store access.
For example, when the usePortableGroups property is enabled, the group argument dev@mycompany in a credential function is read as dev. So if you migrate the pipeline to a different organization that also has a dev group with the same credential store access, the pipeline can be used without updates.
By default, the property is commented out and set to false. When recommended by the StreamSets Support team, you can enable the property by uncommenting the property and setting it to true.
Sets of related properties
Each supported credential store type has a set of related properties. The property names include the default credential store IDs originally specified in the credentialStores property.
For example, the AWS Secrets Manager properties include aws, the default Secrets Manager ID, in each Secrets Manager property name, such as credentialStore.aws.config.region and credentialStore.aws.config.access.key.
When you use a custom credential store ID, you must update all related property names to match the custom ID. For example, if you want to use awsUS as a custom ID, you must update all Secrets Manager default property names for the awsUS credential store replacing aws with awsUS.
Note: When you want to use multiple credential stores of the same type, you must have a set of related store properties that are renamed and defined appropriately for each credential store.

For example, say you want to use two Azure credential stores, azureDev for development and azureProd for production. To do this, you specify the credential store IDs in the credentialStores property and make a copy of the related Azure credential store properties, so you have one set for each credential store.

Then, you rename and configure the properties for azureDev, and you do the same for azureProd. The resulting properties might look as follows. Note the credential store ID in the credentialStores property and in each property name:
################################################
#        Transformer Credential Stores         #
################################################

credentialStores=azureDev,azureProd

#credentialStores.usePortableGroups=false

############################################################
# azureDev: Azure Key Vault Credential Store Configuration #
############################################################

credentialStore.azureDev.def=streamsets-transformer-azure-keyvault-credentialstore-lib::com_streamsets_datacollector_credential_azure_keyvault_AzureKeyVaultCredentialStore
credentialStore.azureDev.config.credential.refresh.millis=30000
credentialStore.azureDev.config.credential.retry.millis=15000
credentialStore.azureDev.config.vault.url=https://development.vault.azure.net/
credentialStore.azureDev.config.client.id=devClientID
credentialStore.azureDev.config.client.key=devClientKey
credentialStore.azureDev.config.enforceEntryGroup=false

#############################################################
# azureProd: Azure Key Vault Credential Store Configuration #
#############################################################

credentialStore.azureProd.def=streamsets-transformer-azure-keyvault-credentialstore-lib::com_streamsets_datacollector_credential_azure_keyvault_AzureKeyVaultCredentialStore
credentialStore.azureProd.config.credential.refresh.millis=30000
credentialStore.azureProd.config.credential.retry.millis=15000
credentialStore.azureProd.config.vault.url=https://production.vault.azure.net/
credentialStore.azureProd.config.client.id=prodClientID
credentialStore.azureProd.config.client.key=prodClientKey
credentialStore.azureProd.config.enforceEntryGroup=false
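
The copy-and-rename step can also be scripted. The following is a minimal sketch in Python and is not part of Transformer. The function name clone_store_properties and the sample dictionary are hypothetical, and the sketch assumes that every property for a store begins with credentialStore.<cstore ID>. as in the listing above.

```python
def clone_store_properties(props, default_id, new_id):
    """Return a copy of the properties for default_id, renamed for new_id.

    Assumes store properties are named credentialStore.<cstore ID>.<rest>.
    """
    # The configuration steps ask for alphabetic characters only in store IDs.
    if not (new_id.isascii() and new_id.isalpha()):
        raise ValueError(f"credential store ID must be alphabetic: {new_id!r}")
    old_prefix = f"credentialStore.{default_id}."
    new_prefix = f"credentialStore.{new_id}."
    return {
        new_prefix + key[len(old_prefix):]: value
        for key, value in props.items()
        if key.startswith(old_prefix)
    }


# Hypothetical subset of the default Azure properties.
defaults = {
    "credentialStore.azure.config.vault.url": "",
    "credentialStore.azure.config.enforceEntryGroup": "false",
}

dev = clone_store_properties(defaults, "azure", "azureDev")
prod = clone_store_properties(defaults, "azure", "azureProd")
for name in sorted(dev) + sorted(prod):
    print(name)
```

Because the prefix ends with a period, properties for azure are not confused with properties for azureDev or azureProd if the function is run again on the combined set.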

Group Access to Secrets

As an additional layer of security, you can employ user groups to further limit access to the secrets defined in credential stores.

Transformer provides two methods to limit access with user groups:
Required group argument in credential functions
Credential functions include a group argument that defines the group that can access the secret. The group argument ensures that the user who attempts to preview, validate, or start a pipeline that includes a credential function belongs to the group specified in the function. The user must also have execute permission on the pipeline.
When working only with Transformer, simply specify the group name, such as devops. When working with Control Hub, specify the group argument as follows: <group ID>@<organization ID>. For example, devops@MyCompany.
If you do not want to restrict access to a secret, specify the default all group when working only with Transformer. When working with Control Hub and a Transformer version earlier than 3.16.0, you must use the default all@<organization ID> group. When working with Control Hub and Transformer version 3.16.0 or later, you can specify the default group using all or all@<organization ID>. If you use the all group instead of all@<organization ID>, you do not need to modify credential functions when migrating pipelines from Transformer to Control Hub.
If Transformer shuts down while running a pipeline that uses a credential function, Transformer restarts the pipeline without checking the group access.
Optional group secrets in the credential store

In addition to using the group argument in credential functions, you can configure Transformer to require group secrets for a credential store.

To require the use of group secrets, in the $TRANSFORMER_CONF/credential-stores.properties file, set the credentialStore.<cstore ID>.config.enforceEntryGroup property to true.

A group secret is a secret defined in the credential store that contains a comma-delimited list of Transformer user groups permitted to access the associated secret.

When a credential store requires group secrets, you must define a group secret for every secret that Transformer accesses in that credential store. The name of the group secret is based on the secret name, as follows:
<secret name>-groups
When you configure a credential function to call a secret, the user group specified in the credential function must be listed in the associated group secret that is defined in the credential store.
For example, say you work with Control Hub and enable Transformer to require group secrets for Azure Key Vault. Then, in an Azure Event Hubs origin, you use the following expression to retrieve a shared access key from the azure credential store:
${credential:get("azure", "production@MyCompany", "sharedAccessKey")}
When you run the pipeline, Transformer validates all of the following:
  • The user who starts the pipeline is in the production user group.
  • The sharedAccessKey secret has an associated sharedAccessKey-groups secret defined in the credential store.
  • The sharedAccessKey-groups secret includes the production user group.

When Transformer is not configured to require group secrets, Transformer validates only the first point, verifying that the user belongs to the specified group.
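
The three checks can be summarized in a short sketch. This is an illustration in Python, not Transformer code. The function name may_access_secret and its parameters are hypothetical, and the sketch assumes that group names are compared as exact strings. How Transformer matches a <group ID>@<organization ID> argument against a user's groups is not modeled here.

```python
def may_access_secret(user_groups, function_group, group_secret=None,
                      enforce_entry_group=False):
    """Return True if the group checks described above pass.

    user_groups: groups that the user who runs the pipeline belongs to.
    function_group: the group argument from the credential function.
    group_secret: value of the <secret name>-groups secret, or None if absent.
    enforce_entry_group: value of the enforceEntryGroup property.
    """
    # Check 1: the user belongs to the group named in the function.
    if function_group not in user_groups:
        return False
    if not enforce_entry_group:
        return True
    # Check 2: the <secret name>-groups secret must exist.
    if group_secret is None:
        return False
    # Check 3: the group secret must list the group named in the function.
    permitted = {g.strip() for g in group_secret.split(",") if g.strip()}
    return function_group in permitted


# Group secrets are not required, so only the first check applies.
print(may_access_secret({"production"}, "production"))
# Group secrets are required and the group secret lists the group.
print(may_access_secret({"production"}, "production",
                        group_secret="devops,production",
                        enforce_entry_group=True))
```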

AWS Secrets Manager

To use the AWS Secrets Manager credential store system, install the AWS Secrets Manager credential store stage library and define the configuration properties used to connect to Secrets Manager. Then, use credential functions in pipeline stage properties to retrieve stored values.

In Secrets Manager, you must configure an access key and secret key pair with permission to read the secrets that Transformer retrieves. As a best practice, make secrets read-only and limit access. For details, see the Secrets Manager documentation on identity and access management (IAM) policies.

Note: This documentation includes Secrets Manager information needed for the configuration process. For more information, see the AWS Secrets Manager documentation.

Step 1. Configure the Credential Store Properties

To enable Transformer to connect to the AWS Secrets Manager credential store, configure the Secrets Manager properties in the $TRANSFORMER_CONF/credential-stores.properties file.

  1. Uncomment the credentialStores property in the file and specify the credential store ID to use. Use only alphabetic characters for the credential store ID.

    By default, the property lists a default credential store ID for each type of credential store, aws for AWS Secrets Manager, azure for Azure Key Vault, and so on. When using one credential store of any type, it's simplest to use the default value.

    To use just a single Secrets Manager, set the value to aws.

    To enable multiple credential stores, specify a comma-separated list of credential store IDs. For example, to use a Java keystore and a Secrets Manager credential store, set the value to jks,aws. To use multiple Secrets Manager credential stores, simply specify separate IDs for each, such as awsDev,awsProd.

  2. Uncomment and configure the following properties as needed.

    If you specified a custom credential store ID, update the names of the following properties, and then configure them as needed. When using the default credential store ID, aws, leave the property names intact, and simply configure the properties.

    To use multiple AWS Secrets Manager credential stores, make a copy of the properties for each credential store. Then, update the credential store ID in each set of property names before defining the properties. For an example, see Enabling Credential Stores.

    Important: Instead of entering sensitive data such as passwords in clear text in the configuration file, you can protect the sensitive data by storing the data in an external location and then using functions to retrieve the data.

    These properties are grouped in the AWS Secrets Manager section of the file:

    Secrets Manager Property
        Description

    credentialStore.<cstore ID>.def
        Required. Defines the implementation of the AWS Secrets Manager credential store.

        Do not change the default value.

    credentialStore.<cstore ID>.config.nameKey.separator
        Optional. Separator to use in the name argument for credential functions.

        Note: In Secrets Manager, names can contain alphanumeric characters and the following special characters: / _ + = . @ -
        Avoid using those characters as separators.

    credentialStore.<cstore ID>.config.region
        Required. AWS region that hosts Secrets Manager. For a list of available regions, see the AWS Region Table.

    credentialStore.<cstore ID>.config.security.method
        Required. Authentication method used to connect to AWS. Set to one of the following values:

        • instanceProfile - Authenticates using an instance profile associated with Transformer.

          Use when Transformer runs on an Amazon EC2 instance that has an associated instance profile. Transformer uses the instance profile credentials to automatically authenticate with AWS.

        • accessKeys - Authenticates using an AWS access key pair.

          Use when Transformer does not run on an Amazon EC2 instance or when the EC2 instance doesn’t have an instance profile.

    credentialStore.<cstore ID>.config.access.key
        Required when using access keys to authenticate with AWS. AWS access key.

    credentialStore.<cstore ID>.config.secret.key
        Required when using access keys to authenticate with AWS. AWS secret key.

    credentialStore.<cstore ID>.config.cache.max.size
        Optional. Maximum number of secrets Transformer can cache locally. Default is 1024.

    credentialStore.<cstore ID>.config.cache.ttl.millis
        Optional. Number of milliseconds that Transformer considers a cached secret valid before requiring a refresh. Default is 1 hour (3600000 milliseconds).

    credentialStore.<cstore ID>.config.enforceEntryGroup
        Optional. Requires Transformer to verify that the user who previews, validates, or starts the pipeline belongs to a group that is permitted to access the secret.

        When set to true, each secret must have a corresponding <secret name>-groups secret that contains a comma-separated list of groups that are permitted to access the secret.

        For more information, see Group Access to Secrets.

        Default is false.

  3. Restart Transformer to enable the changes.
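
Before restarting, it can help to confirm that the required Secrets Manager properties are consistent with the chosen authentication method. The following Python sketch is illustrative only. The function name check_aws_store is hypothetical, the properties are assumed to be already loaded into a dictionary, and the region value is a sample.

```python
def check_aws_store(props, store_id="aws"):
    """Return a list of problems found in the Secrets Manager properties."""
    prefix = f"credentialStore.{store_id}.config."
    problems = []
    # The region property is required for every authentication method.
    if not props.get(prefix + "region"):
        problems.append("region is required")
    method = props.get(prefix + "security.method")
    if method not in ("instanceProfile", "accessKeys"):
        problems.append("security.method must be instanceProfile or accessKeys")
    elif method == "accessKeys":
        # Both keys are required only when authenticating with access keys.
        for key in ("access.key", "secret.key"):
            if not props.get(prefix + key):
                problems.append(f"{key} is required with accessKeys")
    return problems


# Sample values: an instance profile needs no access key pair.
sample = {
    "credentialStore.aws.config.region": "us-west-2",
    "credentialStore.aws.config.security.method": "instanceProfile",
}
print(check_aws_store(sample))
```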

Step 2. Call Secrets from the Pipeline

Specify credential functions in stage or pipeline properties to retrieve secrets stored in AWS Secrets Manager.

Use the credential functions in any stage property that displays the key icon next to it.

Important: When you use a credential function in a stage or pipeline property, the function must be the only value defined in the property.

For details about credential functions, see Credential Functions.

Azure Key Vault

Before Transformer can connect to the Microsoft Azure Key Vault credential store system, you must complete several prerequisites in Azure so that Transformer can access the Azure Key Vault as an application.

After completing the prerequisites, install the Azure Key Vault credential store stage library and define the configuration properties used to connect to Azure Key Vault. Then, define credential functions in stage or pipeline properties to retrieve stored values.

Note: This documentation includes details about Azure Key Vault to simplify the configuration process. For more information, see the Azure Key Vault documentation.

Prerequisites

Before Transformer can connect to the Microsoft Azure Key Vault credential store system, you must complete the following prerequisites within Azure:

Register Transformer with Azure Active Directory
Use the Azure portal to register Transformer as an application in Azure Active Directory. When an application such as Transformer accesses secrets in an Azure key vault, the application must use an authentication token from Azure Active Directory.
The registration process assigns Transformer the following values, which you will specify when you configure the credential store properties:
  • application ID
  • authentication key
For more information about registering applications in Azure Active Directory, see the Azure Key Vault documentation.
Authorize Transformer to use keys or secrets in the Azure key vault
Use the Azure portal to authorize Transformer to use the keys or secrets in the Azure key vault. Azure Key Vault requires that applications be authorized to access each key vault.
For information about authorizing applications to use keys or secrets, see the Azure Key Vault documentation.

Step 1. Configure the Credential Store Properties

To enable Transformer to connect to the Azure Key Vault credential store, configure the Azure Key Vault properties in the $TRANSFORMER_CONF/credential-stores.properties file.

  1. Uncomment the credentialStores property in the file and specify the credential store ID to use. Use only alphabetic characters for the credential store ID.

    By default, the property lists a default credential store ID for each type of credential store, aws for AWS Secrets Manager, azure for Azure Key Vault, and so on. When using one credential store of any type, it's simplest to use the default value.

    To use just a single Azure Key Vault, set the value to azure.

    To enable multiple credential stores, specify a comma-separated list of credential store IDs. For example, to use a Java keystore and an Azure Key Vault credential store, set the value to jks,azure. To use multiple Azure Key Vault credential stores, simply specify separate IDs for each, such as azureDev,azureProd.

  2. Uncomment and configure the following properties as needed.

    If you specified a custom credential store ID, update the names of the following properties, and then configure them as needed. When using the default credential store ID, azure, leave the property names intact, and simply configure the properties.

    To use multiple Azure Key Vault credential stores, make a copy of the properties for each credential store. Then, update the credential store ID in each set of property names before defining the properties. For an example, see Enabling Credential Stores.

    Important: Instead of entering sensitive data such as passwords in clear text in the configuration file, you can protect the sensitive data by storing the data in an external location and then using functions to retrieve the data.

    The properties are grouped in the Azure Key Vault section of the file:

    Azure Key Vault Property
        Description

    credentialStore.<cstore ID>.def
        Required. Defines the implementation of the Azure Key Vault credential store.

        Do not change the default value.

    credentialStore.<cstore ID>.config.credential.refresh.millis
        Optional. Number of milliseconds that Transformer locally caches a secret. When the time expires, Transformer retrieves the secret from Azure Key Vault.

    credentialStore.<cstore ID>.config.credential.retry.millis
        Optional. Number of milliseconds that Transformer waits before retrying the retrieval of a secret from Azure Key Vault after an error.

    credentialStore.<cstore ID>.config.vault.url
        Required. URL to the key vault created in Azure Key Vault.

        Use the following format:

        https://<key vault name>.vault.azure.net/

    credentialStore.<cstore ID>.config.client.id
        Required. Application ID assigned to this Transformer when you registered Transformer as an application in Azure Active Directory, as described in Prerequisites.

    credentialStore.<cstore ID>.config.client.key
        Required. Authentication key assigned to this Transformer when you registered Transformer as an application in Azure Active Directory, as described in Prerequisites.

    credentialStore.<cstore ID>.config.enforceEntryGroup
        Optional. Requires Transformer to verify that the user who previews, validates, or starts the pipeline belongs to a group that is permitted to access the secret.

        When set to true, each secret must have a corresponding <secret name>-groups secret that contains a comma-separated list of groups that are permitted to access the secret.

        For more information, see Group Access to Secrets.

        Default is false.

  3. Restart Transformer to enable the changes.

Step 2. Call Secrets from the Pipeline

Specify credential functions in stage or pipeline properties to retrieve secrets stored in Azure Key Vault.

You can configure credential functions in any property that displays the key icon next to it.

Important: When you use a credential function in a stage or pipeline property, the function must be the only value defined in the property.

For details about credential functions, see Credential Functions.
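
To illustrate the rule that a credential function must be the only value in a property, the following Python sketch builds an expression and checks a property value. It is not part of Transformer. The helper names are hypothetical, and the regular expression is an assumed shape modeled on the credential:get expression shown in Group Access to Secrets, with all three arguments quoted. It is not the product's expression grammar.

```python
import re

# Assumed shape of a credential:get call with three quoted arguments.
CREDENTIAL_CALL = re.compile(
    r'^\$\{credential:get\("([^"]+)",\s*"([^"]+)",\s*"([^"]+)"\)\}$'
)


def credential_get(store_id, group, name):
    """Build a credential:get expression for a stage or pipeline property."""
    return f'${{credential:get("{store_id}", "{group}", "{name}")}}'


def is_sole_credential_function(value):
    """Return True if the property value is exactly one credential:get call."""
    return CREDENTIAL_CALL.match(value.strip()) is not None


expression = credential_get("azure", "production@MyCompany", "sharedAccessKey")
print(expression)
print(is_sole_credential_function(expression))
# A value that mixes other text with the function does not pass the check.
print(is_sole_credential_function("Endpoint=" + expression))
```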

CyberArk

To use the CyberArk credential store system, install the CyberArk Credential Store stage library and define the configuration properties used to connect to CyberArk Application Identity Manager. Then, use credential functions in pipeline stage properties to retrieve stored values.

At this time, Transformer supports CyberArk integration only through web services that connect to the CyberArk Central Credential Provider.
Note: This documentation includes details about CyberArk to simplify the configuration process. For more information, see the CyberArk documentation.

Step 1. Configure the Credential Store Properties

To enable Transformer to connect to the CyberArk credential store, configure the CyberArk properties in the $TRANSFORMER_CONF/credential-stores.properties file.

  1. Uncomment the credentialStores property in the file and specify the credential store ID to use. Use only alphabetic characters for the credential store ID.

    By default, the property lists a default credential store ID for each type of credential store, aws for AWS Secrets Manager, azure for Azure Key Vault, and so on. When using one credential store of any type, it's simplest to use the default value.

    To use just a single CyberArk credential store, set the value to cyberark.

    To enable multiple credential stores, specify a comma-separated list of credential store IDs. For example, to use a Java keystore and a CyberArk credential store, set the value to jks,cyberark. To use multiple CyberArk credential stores, simply specify separate IDs for each, such as cyberarkDev,cyberarkProd.

  2. Uncomment and configure the following properties as needed.

    If you specified a custom credential store ID, update the names of the following properties, and then configure them as needed. When using the default credential store ID, cyberark, leave the property names intact, and simply configure the properties.

    To use multiple CyberArk credential stores, make a copy of the properties for each credential store. Then, update the credential store ID in each set of property names before defining the properties. For an example, see Enabling Credential Stores.
    Important: Instead of entering sensitive data such as passwords in clear text in the configuration file, you can protect the sensitive data by storing the data in an external location and then using functions to retrieve the data.

    These properties are grouped in the CyberArk section of the file:

    CyberArk Property
        Description

    credentialStore.<cstore ID>.def
        Required. Defines the implementation of the CyberArk credential store.

        Do not change the default value.

    credentialStore.<cstore ID>.config.credential.refresh.millis
        Optional. Number of milliseconds that Transformer locally caches a credential. When the time expires, Transformer retrieves the credential from CyberArk.

    credentialStore.<cstore ID>.config.credential.retry.millis
        Optional. Number of milliseconds that Transformer waits before retrying the retrieval of a credential from CyberArk after an error.

    credentialStore.<cstore ID>.config.connector
        Optional. Connector type to CyberArk. Leave the default, webservices, because only the web services connector is currently supported.

    credentialStore.<cstore ID>.config.ws.url
        Required. CyberArk Central Credential Provider web service URL.

        Use the following format:

        https://<host name>:<port>/AIMWebService/api/Accounts

    credentialStore.<cstore ID>.config.ws.appId
        Required. CyberArk application ID for this Transformer. You must create the application ID in CyberArk.

    credentialStore.<cstore ID>.config.ws.maxConcurrentConnections
        Optional. Maximum number of concurrent web service calls that Transformer can make to CyberArk.

    credentialStore.<cstore ID>.config.ws.validateAfterInactivity.millis
        Optional. Number of milliseconds of inactivity before Transformer validates the HTTP connection to CyberArk.

    credentialStore.<cstore ID>.config.ws.connectionTimeout.millis
        Optional. Number of milliseconds to wait for a connection to CyberArk.

    credentialStore.<cstore ID>.config.ws.nameSeparator
        Optional. Separator to use in the name argument that credential functions use.

        Use the following format for the name argument:

        <safe><separator><folder><separator><object name><separator><element name>

        For example, if you keep the default ampersand (&), the format for the name argument is:

        <safe>&<folder>&<object name>&<element name>

    credentialStore.<cstore ID>.config.ws.http.authentication
        Optional. Authentication type used by the CyberArk Central Credential Provider web services: none, basic, or digest.

        Default is none.

    credentialStore.<cstore ID>.config.ws.http.authentication.user
        Optional. User name if using basic or digest authentication.

    credentialStore.<cstore ID>.config.ws.http.authentication.password
        Optional. Password if using basic or digest authentication.

        To protect the password, store the password in an external location and then use a function to retrieve the password.

    credentialStore.<cstore ID>.config.ws.truststoreFile
        Optional. Path to the truststore file if using HTTPS and the server certificate is using a private CA or is not trusted by the Java default truststore file.

        Enter a path relative to the Transformer configuration directory, $TRANSFORMER_CONF.

    credentialStore.<cstore ID>.config.ws.truststorePassword
        Optional. Password for the truststore file.

        To protect the password, store the password in an external location and then use a function to retrieve the password.

    credentialStore.<cstore ID>.config.ws.supportedProtocols
        Optional. SSL/TLS-enabled protocols. Versions TLSv1.2 or later are recommended.

    credentialStore.<cstore ID>.config.ws.hostnameVerifier.skip
        Optional. Determines whether the host name of the CyberArk Central Credential Provider web services should be verified against the domain defined in the HTTPS certificate.

        By default, the host name is verified.

    credentialStore.<cstore ID>.config.ws.keystoreFile
        Optional. If using HTTPS and the CyberArk Central Credential Provider web services require client-side certificates, the path to the keystore file that contains the client certificate.

        Enter a path relative to the Transformer configuration directory, $TRANSFORMER_CONF.

    credentialStore.<cstore ID>.config.ws.keystorePassword
        Optional. Password for the keystore file.

        To protect the password, store the password in an external location and then use a function to retrieve the password.

    credentialStore.<cstore ID>.config.ws.keyPassword
        Optional. Password to access the certificate within the keystore file.

        To protect the password, store the password in an external location and then use a function to retrieve the password.

    credentialStore.<cstore ID>.config.enforceEntryGroup
        Optional. Requires Transformer to verify that the user who previews, validates, or starts the pipeline belongs to a group that is permitted to access the secret.

        When set to true, each secret must have a corresponding <secret name>-groups secret that contains a comma-separated list of groups that are permitted to access the secret.

        For more information, see Group Access to Secrets.

        Default is false.

  3. Restart Transformer to enable the changes.

Step 2. Call Secrets from the Pipeline

Specify credential functions in stage or pipeline properties to retrieve secrets stored in CyberArk.

Use the credential functions in any stage property that displays the key icon next to it.

Important: When you use a credential function in a stage or pipeline property, the function must be the only value defined in the property.

For details about credential functions, see Credential Functions.
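
The name argument for a CyberArk credential function follows the format given for the ws.nameSeparator property in Step 1. The following Python sketch shows one way to assemble it. The function name cyberark_name and the sample safe, folder, object, and element values are hypothetical.

```python
def cyberark_name(safe, folder, object_name, element_name, separator="&"):
    """Join the four parts of a CyberArk name argument with the separator."""
    parts = [safe, folder, object_name, element_name]
    for part in parts:
        # A part that contains the separator would make the name ambiguous.
        if separator in part:
            raise ValueError(f"{part!r} contains the separator {separator!r}")
    return separator.join(parts)


# Hypothetical values, joined with the default ampersand separator.
name = cyberark_name("Test", "Root", "OracleDB", "password")
print(name)
```

If a safe, folder, or object name contains an ampersand, configure a different separator in ws.nameSeparator and pass the same character as the separator argument.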

Google Secret Manager

To use a Google Secret Manager credential store system, install the Google Secret Manager Credentials Store stage library and define the configuration properties used to connect to Secret Manager. Then, use a credential function in pipeline stage properties to retrieve stored values.

As a best practice, make secrets read-only and limit access. For additional suggestions, see the Google Secret Manager best practices documentation.

Note: This documentation includes Secret Manager information needed for the configuration process. For more information about Secret Manager, see the Google Secret Manager documentation.

Authentication

Transformer must authenticate with Google Secret Manager using Google credentials.

When you configure the credential store properties, you configure Transformer to use one of the following credential modes:
Default
Transformer authenticates with Google Secret Manager using the credentials file defined in the GOOGLE_APPLICATION_CREDENTIALS environment variable.
Set the environment variable on the Transformer machine. If you run Transformer on a VM on Google Cloud Platform, use an instance service account with access to Google Secret Manager.
For more information about using default credentials, see the Google Cloud documentation.
JSON
Transformer authenticates with Google Secret Manager using JSON-formatted credential information specified in the credential store configuration properties. You copy the JSON content from a Google Cloud service account credentials file.
Enter the JSON content in plain text. If the content includes multiple lines of text, add a backslash (\) at the end of each line.
JSON Path
Transformer authenticates with Google Secret Manager using a Google Cloud service account credentials file. Store the file in the same location on the Transformer machine and on each node in the Spark cluster.
Enter an absolute path to the file in the credential store configuration properties.

For information about generating a service account credential file, see the Google Cloud Platform documentation.
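
For the JSON credential mode, the multi-line content of a credentials file needs a backslash at each line break. The following Python sketch shows one way to prepare the text. The function name to_properties_value and the sample JSON are hypothetical. The sketch omits the backslash after the final line, on the assumption that a trailing backslash there would join the value to the next line in the properties file.

```python
def to_properties_value(json_text):
    """Join the lines of json_text with a backslash before each line break."""
    lines = json_text.strip().splitlines()
    # "\\\n" is a single backslash followed by a newline.
    return "\\\n".join(lines)


# Hypothetical, shortened credentials content.
sample = '{\n  "type": "service_account"\n}'
print(to_properties_value(sample))
```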

Step 1. Configure the Credential Store Properties

To enable Transformer to connect to the Google Secret Manager credential store, configure the Secret Manager properties in the $TRANSFORMER_CONF/credential-stores.properties file.

  1. Uncomment the credentialStores property in the file and specify the credential store ID to use. Use only alphabetic characters for the credential store ID.

    By default, the property lists a default credential store ID for each type of credential store, such as aws for AWS Secrets Manager and azure for Azure Key Vault. When you use a single credential store of a given type, it's simplest to keep the default ID.

    To use a single Google Secret Manager credential store, set the value to gcp.

    To enable multiple credential stores, specify a comma-separated list of credential store IDs. For example, to use a Java keystore and a Google Secret Manager credential store, set the value to jks,gcp. To use multiple Google Secret Manager credential stores, specify a separate ID for each, such as gcpDev,gcpProd.

  2. Uncomment and configure the following properties as needed.

    If you specified a custom credential store ID, replace the default ID in the following property names with the custom ID, and then configure the properties as needed. When using the default credential store ID, gcp, leave the property names intact and simply configure the properties.

    To use multiple Google Secret Manager credential stores, make a copy of the properties for each credential store. Then, update the credential store ID in each set of property names before defining the properties. For an example, see Enabling Credential Stores.

    Important: Instead of entering sensitive data such as passwords in clear text in the configuration file, you can protect the sensitive data by storing the data in an external location and then using functions to retrieve the data.

    The properties are grouped in the Google Secret Manager section of the file:

    credentialStore.<cstore ID>.def
    Required. Defines the implementation of the Google Secret Manager credential store.
    Do not change the default value.

    credentialStore.<cstore ID>.config.cache.inactivityExpiration.millis
    Expiration time for the cache in milliseconds.
    Default is 1800000.

    credentialStore.<cstore ID>.config.delimiter
    Delimiter to use in the name argument of credential functions to separate the secret name from the version ID. Use a single character that is not included in credential names.
    Use the following format for the name argument:
    <name><delimiter><version id>
    For example, if you use a slash, the format for the name argument is:
    <name>/<version id>
    Default is question mark (?).

    credentialStore.<cstore ID>.config.project.id
    ID of the project associated with Secret Manager.

    credentialStore.<cstore ID>.config.credentialsMode
    Credentials to use for authentication with Secret Manager:
    • default - Uses Google Cloud default credentials.
    • json - Uses JSON-formatted credentials information specified in the credential store configuration properties.
    • jsonPath - Uses a JSON service account credentials file stored in the same location on the Transformer machine and on each node in the Spark cluster.
    For more information, see Authentication.

    credentialStore.<cstore ID>.config.credentialsJson
    Contents of a Google Cloud service account credentials JSON file.
    Enter JSON-formatted credential information in plain text. If the content spans multiple lines, add a backslash (\) at the end of each line except the last.
    Required when using the json credentials mode.

    credentialStore.<cstore ID>.config.credentialsJsonPath
    Absolute path to the Google Cloud service account credentials file stored in the same location on the Transformer machine and on each node in the Spark cluster. The credentials file must be a JSON file.
    Required when using the jsonPath credentials mode.

    credentialStore.<cstore ID>.config.enforceEntryGroup
    Optional. Requires Transformer to verify whether the user who previews, validates, or starts the pipeline belongs to a group that is permitted to access the secret.
    When set to true, each secret must have a corresponding <secret name>-groups secret that contains a comma-separated list of the groups that are permitted to access the secret.
    For more information, see Group Access to Secrets.
    Default is false.

  3. Restart Transformer to enable the changes.
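
As a rough illustration of the steps above, the following sketch writes a minimal Google Secret Manager configuration to a scratch file rather than to the real credential-stores.properties file. The project ID and key file path are placeholders invented for this example. The def property is left out because the value shipped in the file should not be changed.

```shell
# Write the fragment to a scratch file so that nothing in $TRANSFORMER_CONF is modified.
fragment="$(mktemp)"

cat > "$fragment" <<'EOF'
# Enable the Google Secret Manager credential store with the default ID.
credentialStores=gcp

# Placeholder project ID; replace with the project that hosts your secrets.
credentialStore.gcp.config.project.id=example-project

# Authenticate with a service account key file stored at the same path
# on the Transformer machine and on each node in the Spark cluster.
credentialStore.gcp.config.credentialsMode=jsonPath
credentialStore.gcp.config.credentialsJsonPath=/opt/transformer/secrets/gcp-service-account.json
EOF

# Show the resulting property lines without comments or blank lines.
grep -v -e '^#' -e '^$' "$fragment"
```

After reviewing the output, you would copy the equivalent values into the Google Secret Manager section of the real file and restart Transformer.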

Step 2. Call Secrets from the Pipeline

Specify credential functions in stage or pipeline properties to retrieve secrets stored in Google Secret Manager.

You can configure credential functions in any property that displays the key icon next to it.

Important: When you use a credential function in a stage or pipeline property, the function must be the only value defined in the property.

For details about credential functions, see Credential Functions.
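
The name argument format defined by the delimiter property can be sketched as follows. The sketch assumes that the credential:get() function takes a credential store ID, a user group, and a name, in that order. The secret name dbPassword, the version ID 2, and the group all are placeholders; see Credential Functions for the authoritative syntax.

```shell
cstore_id='gcp'           # default credential store ID
group='all'               # placeholder user group
secret_name='dbPassword'  # placeholder secret name
version_id='2'            # placeholder secret version ID
delimiter='?'             # default value of the delimiter property

# Name argument format: <name><delimiter><version id>
name_arg="${secret_name}${delimiter}${version_id}"

# Build the expression to enter in a stage or pipeline property.
expression=$(printf '${credential:get("%s", "%s", "%s")}' "$cstore_id" "$group" "$name_arg")
printf '%s\n' "$expression"
```

If you change the delimiter property, only the delimiter variable in the sketch changes; the rest of the expression stays the same.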

Hashicorp Vault

To use the Hashicorp Vault credential store system, install the Vault Credential Store stage library and define the configuration properties used to connect to Hashicorp Vault. Then, use credential functions in pipeline stage properties to retrieve stored values.

Note: This documentation includes details about Hashicorp Vault to simplify the configuration process. For more information, see the Hashicorp Vault documentation.

Step 1. Configure the Credential Store Properties

To enable Transformer to connect to the Hashicorp Vault credential store, configure the Hashicorp Vault properties in the $TRANSFORMER_CONF/credential-stores.properties file.

  1. Uncomment the credentialStores property in the file and specify the credential store ID to use. Use only alphabetic characters for the credential store ID.

    By default, the property lists a default credential store ID for each type of credential store, such as aws for AWS Secrets Manager and azure for Azure Key Vault. When you use a single credential store of a given type, it's simplest to keep the default ID.

    To use just a single Hashicorp Vault credential store, set the value to vault.

    To enable multiple credential stores, specify a comma-separated list of credential store IDs. For example, to use a Java keystore and a Hashicorp Vault credential store, set the value to jks,vault. To use multiple Hashicorp Vault credential stores, simply specify separate IDs for each, such as vaultDev,vaultProd.

  2. Uncomment and configure the following properties as needed.

    If you specified a custom credential store ID, replace the default ID in the following property names with the custom ID, and then configure the properties as needed. When using the default credential store ID, vault, leave the property names intact and simply configure the properties.

    To use multiple Hashicorp Vault credential stores, make a copy of the properties for each credential store. Then, update the credential store ID in each set of property names before defining the properties. For an example, see Enabling Credential Stores.

    Important: Instead of entering sensitive data such as passwords in clear text in the configuration file, you can protect the sensitive data by storing the data in an external location and then using functions to retrieve the data.

    These properties are grouped in the Hashicorp Vault section of the file:

    credentialStore.<cstore ID>.def
    Required. Defines the implementation of the Vault credential store.
    Do not change the default value.

    credentialStore.<cstore ID>.config.pathKey.separator
    Optional. Separator to use in the name argument of credential functions to separate the path from the key.
    Use the following format for the name argument:
    <path><separator><key>
    For example, if you keep the default ampersand (&), the format for the name argument is:
    <path>&<key>

    credentialStore.<cstore ID>.config.addr
    Required. Vault server URL, entered in the following format:
    https://<host name>:<port number>
    Use HTTPS to avoid unencrypted communication. The Transformer machine and each node in the Spark cluster must have access to this URL.

    credentialStore.<cstore ID>.config.role.id
    Required. Vault Role ID that Transformer uses to authenticate with Vault. The Role ID is configured within Vault by your Vault administrator.
    The Transformer Vault integration relies on Vault's AppRole authentication backend.
    Important: The App ID authentication backend has been deprecated by Hashicorp and will be removed in a future release. As a result, do not configure the credentialStore.vault.config.app.id property for new installations.

    credentialStore.<cstore ID>.config.secret.id
    Required. Vault Secret ID that Transformer uses to authenticate with Vault. The Secret ID is configured within Vault by your Vault administrator.
    To protect the Secret ID, store the Secret ID in an external location and then use a function to retrieve the Secret ID.
    Default uses the file function to retrieve the Secret ID from vault-secret-id in the Transformer configuration directory, $TRANSFORMER_CONF.

    credentialStore.<cstore ID>.config.version
    Version of the KV secrets engine used by Vault. Enter 1 or 2.
    Default is 1.

    credentialStore.<cstore ID>.config.lease.renewal.interval.sec
    Optional. Seconds to wait before checking for leases that need renewal.
    Default is 60.

    credentialStore.<cstore ID>.config.lease.expiration.buffer.sec
    Optional. Buffer for expiring leases. Transformer renews leases that expire in less than the specified number of seconds.
    Default is 120.

    credentialStore.<cstore ID>.config.open.timeout
    Optional. Timeout to establish an HTTP connection to Vault in milliseconds.
    Default is 0 for no limit.

    credentialStore.<cstore ID>.config.proxy.address
    Optional. Proxy URL. Configure to use a proxy to access Vault.

    credentialStore.<cstore ID>.config.proxy.port
    Optional. Proxy port. Configure to use a proxy to access Vault.

    credentialStore.<cstore ID>.config.proxy.username
    Optional. Proxy username. Configure to use a proxy to access Vault.

    credentialStore.<cstore ID>.config.proxy.password
    Optional. Proxy password. Configure to use a proxy to access Vault.
    To protect the password, store the password in an external location and then use a function to retrieve the password.

    credentialStore.<cstore ID>.config.read.timeout
    Optional. Milliseconds to wait for data before timing out.
    Default is 0 for no limit.

    credentialStore.<cstore ID>.config.ssl.enabled.protocols
    Optional. SSL/TLS protocols to enable. Versions TLSv1.2 or later are recommended.
    Default is TLSv1.2,TLSv1.3.

    credentialStore.<cstore ID>.config.ssl.truststore.file
    Optional. Path to a Java truststore file. Required when using a private CA or certificates not trusted by the Java default truststore.

    credentialStore.<cstore ID>.config.ssl.truststore.password
    Optional. Password for the truststore file.
    To protect the password, store the password in an external location and then use a function to retrieve the password.

    credentialStore.<cstore ID>.config.ssl.verify
    Optional. Whether to verify that the Vault server hostname matches its certificate.
    Default is true. Setting the property to false is not recommended.

    credentialStore.<cstore ID>.config.ssl.timeout
    Optional. Timeout for the SSL/TLS handshake in milliseconds.
    Default is 0 for no limit.

    credentialStore.<cstore ID>.config.timeout
    Optional. Timeout to read from Vault in milliseconds, after a connection has been established.
    Default is 0 for no limit.

    credentialStore.<cstore ID>.config.enforceEntryGroup
    Optional. Requires Transformer to verify whether the user who previews, validates, or starts the pipeline belongs to a group that is permitted to access the secret.
    When set to true, each secret must have a corresponding <secret name>-groups secret that contains a comma-separated list of the groups that are permitted to access the secret.
    For more information, see Group Access to Secrets.
    Default is false.

  3. Restart Transformer to enable the changes.
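
A minimal Hashicorp Vault configuration might look like the following sketch, which writes the properties to a scratch file instead of the real credential-stores.properties file. The host name and Role ID are placeholders. The ${file("vault-secret-id")} syntax is an assumption about how the file function is written, so compare it with the default value shipped in your file.

```shell
# Write the fragment to a scratch file so that nothing in $TRANSFORMER_CONF is modified.
fragment="$(mktemp)"

# The quoted heredoc delimiter keeps the shell from expanding ${file(...)}.
cat > "$fragment" <<'EOF'
# Enable the Hashicorp Vault credential store with the default ID.
credentialStores=vault

# Placeholder host and port; use HTTPS so that traffic to Vault is encrypted.
credentialStore.vault.config.addr=https://vault.example.com:8200

# Placeholder Role ID issued by the Vault administrator.
credentialStore.vault.config.role.id=00000000-0000-0000-0000-000000000000

# Assumed syntax: read the Secret ID from a file in $TRANSFORMER_CONF.
credentialStore.vault.config.secret.id=${file("vault-secret-id")}

# KV secrets engine version used by the Vault server: 1 or 2.
credentialStore.vault.config.version=2
EOF

# Show the resulting property lines without comments or blank lines.
grep -v -e '^#' -e '^$' "$fragment"
```

The optional timeout, proxy, and SSL/TLS properties are omitted here and keep their defaults.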

Step 2. Call Secrets from the Pipeline

Specify credential functions in stage or pipeline properties to retrieve secrets stored in Hashicorp Vault.

You can configure credential functions in any property that displays the key icon next to it.

Important: When you use a credential function in a stage or pipeline property, the function must be the only value defined in the property.

For details about credential functions, see Credential Functions.
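
For Hashicorp Vault, the name argument combines a path and a key, as described for the pathKey.separator property. The following sketch assembles such an argument, again assuming that credential:get() takes a credential store ID, a user group, and a name. The path, key, and group are placeholders.

```shell
path='secret/transformer/oracle'  # placeholder Vault path
key='password'                    # placeholder key stored at that path
separator='&'                     # default value of the pathKey.separator property

# Name argument format: <path><separator><key>
name_arg="${path}${separator}${key}"

# Build the expression to enter in a stage or pipeline property.
expression=$(printf '${credential:get("vault", "all", "%s")}' "$name_arg")
printf '%s\n' "$expression"
```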

Java Keystore

To use the Java keystore credential store system, install the Java keystore credential store stage library and define the configuration properties used to connect to the credential store.

Use the stagelib-cli jks-credentialstore command to add secrets to the credential store. Then, use credential functions in stage or pipeline properties to retrieve those secrets.

Important: Use the Java keystore credential store system in development environments only.

A Java keystore credential storage system requires the distribution of a keystore file, which complicates security. Before using a Java keystore system, decide how the keystore will be distributed and consult with your IT security team to ensure that the system meets IT policies.

Step 1. Configure the Credential Store Properties

To enable Transformer to connect to the Java keystore credential store, configure the Java keystore properties in the $TRANSFORMER_CONF/credential-stores.properties file.

  1. Uncomment the credentialStores property in the file and specify the credential store ID to use. Use only alphabetic characters for the credential store ID.

    By default, the property lists a default credential store ID for each type of credential store, such as aws for AWS Secrets Manager and azure for Azure Key Vault. When you use a single credential store of a given type, it's simplest to keep the default ID.

    To use just a single Java keystore, set the value to jks.

    To enable multiple credential stores, specify a comma-separated list of credential store IDs. For example, to use a Java keystore and an AWS Secrets Manager credential store, set the value to jks,aws. To use multiple Java keystore credential stores, specify a separate ID for each, such as jksDev,jksProd.

  2. Uncomment and configure the following properties as needed.

    If you specified a custom credential store ID, replace the default ID in the following property names with the custom ID, and then configure the properties as needed. When using the default credential store ID, jks, leave the property names intact and simply configure the properties.

    To use multiple Java keystore credential stores, make a copy of the properties for each credential store. Then, update the credential store ID in each set of property names before defining the properties. For an example, see Enabling Credential Stores.

    Important: Instead of entering sensitive data such as passwords in clear text in the configuration file, you can protect the sensitive data by storing the data in an external location and then using functions to retrieve the data.

    These properties are grouped in the Java keystore section of the file:

    credentialStore.<cstore ID>.def
    Required. Defines the implementation of the Java Keystore credential store.
    Do not change the default value.

    credentialStore.<cstore ID>.config.keystore.type
    Required. Format of the Java keystore file:
    • JCEKS
    • PKCS12
    Default is PKCS12.

    credentialStore.<cstore ID>.config.keystore.file
    Required. Path and name of the Java keystore file. Enter an absolute path to the file, or a path relative to the Transformer configuration directory, $TRANSFORMER_CONF.
    Default is jks-credentialStore.pkcs12.

    credentialStore.<cstore ID>.config.keystore.storePassword
    Required. Password that Transformer uses to access the Java keystore file.
    You must change the default value before using the keystore file.
    To protect the password, store the password in an external location and then use a function to retrieve the password.

    credentialStore.<cstore ID>.config.keystore.file.min.refresh.millis
    Milliseconds that Transformer waits before reloading the keystore file.
    Default is 10000, or ten seconds.

  3. Restart Transformer to enable the changes.
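
The following sketch shows what a minimal Java keystore configuration could look like, written to a scratch file rather than to the real credential-stores.properties file. The password file name jks-store-password.txt is hypothetical, and the ${file(...)} syntax is an assumption about how the file function is written.

```shell
# Write the fragment to a scratch file so that nothing in $TRANSFORMER_CONF is modified.
fragment="$(mktemp)"

# The quoted heredoc delimiter keeps the shell from expanding ${file(...)}.
cat > "$fragment" <<'EOF'
# Enable the Java keystore credential store with the default ID.
credentialStores=jks

# Keystore format and file; the relative path resolves against $TRANSFORMER_CONF.
credentialStore.jks.config.keystore.type=PKCS12
credentialStore.jks.config.keystore.file=jks-credentialStore.pkcs12

# Assumed syntax and hypothetical file name: read the keystore password from a file.
credentialStore.jks.config.keystore.storePassword=${file("jks-store-password.txt")}

# Wait at least ten seconds before reloading the keystore file.
credentialStore.jks.config.keystore.file.min.refresh.millis=10000
EOF

# Show the resulting property lines without comments or blank lines.
grep -v -e '^#' -e '^$' "$fragment"
```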

Step 2. Add Secrets to the Java Keystore

Use the stagelib-cli jks-credentialstore command to define secrets in the Java keystore file. You can define multiple secrets in the file.

Use the command from the $TRANSFORMER_DIST directory as follows:

bin/streamsets stagelib-cli jks-credentialstore add -i <cstore ID> -n <secret name> -c <secret value>

For example, the following command adds a secret named OracleDBPassword with the value 278yT6u to the jks Java keystore credential store:

bin/streamsets stagelib-cli jks-credentialstore add -i jks -n OracleDBPassword -c 278yT6u

Note: The stagelib-cli jks-credentialstore command also includes delete and list subcommands that you can use to manage the secrets defined in the keystore file. For information about these subcommands, see jks-credentialstore Command.

Step 3. Call Secrets from the Pipeline

Specify the credential:get() function in stage or pipeline properties to call secrets from a Java keystore.

You can configure a credential function in any property that displays the key icon next to it.

Important: When you use a credential function in a stage or pipeline property, the function must be the only value defined in the property.

For details about the credential:get() function, see Credential Functions.

jks-credentialstore Command

The stagelib-cli jks-credentialstore command provides subcommands to add, list, and delete secrets in the Java keystore credential store.

Any changes made to the Java keystore file take effect immediately. For example, if you change the value of an existing secret in the file, running pipelines that require a new connection to the external system use the updated value.

You can use the following subcommands with the stagelib-cli jks-credentialstore command:
add
Adds a secret to the Java keystore credential store.
Use the command from the $TRANSFORMER_DIST directory as follows:
bin/streamsets stagelib-cli jks-credentialstore add \
(-i <cstore ID> | --id <cstore ID>) \
(-n <secret name> | --name <secret name>) \
(-c <secret value> | --credential <secret value>)
Add options:

-i <cstore ID> or --id <cstore ID>
Required. Unique ID for the credential store.
The default ID for a Java keystore is jks.

-n <secret name> or --name <secret name>
Required. Name of the secret to add to the Java keystore credential store.
If the name includes non-alphanumeric characters, use single quotation marks around the name.

-c <secret value> or --credential <secret value>
Required. Value of the secret to add to the Java keystore credential store.
If the value includes non-alphanumeric characters, use single quotation marks around the value.

For example, the following command adds a secret named OracleDBPassword with the value df35yT_&5 to the devjks Java keystore credential store:

bin/streamsets stagelib-cli jks-credentialstore add -i devjks -n OracleDBPassword -c 'df35yT_&5'
delete
Deletes a secret from the Java keystore credential store.
Use the command from the $TRANSFORMER_DIST directory as follows:
bin/streamsets stagelib-cli jks-credentialstore delete \
(-i <cstore ID> | --id <cstore ID>) \
(-n <secret name> | --name <secret name>)
Delete options:

-i <cstore ID> or --id <cstore ID>
Required. Unique ID for the credential store.
The default ID for a Java keystore is jks.

-n <secret name> or --name <secret name>
Required. Name of the secret to delete from the Java keystore credential store.
If the name includes non-alphanumeric characters, use single quotation marks around the name.

For example, the following command deletes a secret named SQLServerDBPassword from the devjks Java keystore credential store:
bin/streamsets stagelib-cli jks-credentialstore delete -i devjks -n SQLServerDBPassword
list
Lists the names of all secrets defined in the Java keystore credential store. The command does not list the secret values.
Use the command from the $TRANSFORMER_DIST directory as follows:
bin/streamsets stagelib-cli jks-credentialstore list \
(-i <cstore ID> | --id <cstore ID>)
List options:

-i <cstore ID> or --id <cstore ID>
Required. Unique ID for the credential store.
The default ID for a Java keystore is jks.

For example, the following command lists the names of all secrets defined in the devjks Java keystore credential store:
bin/streamsets stagelib-cli jks-credentialstore list -i devjks
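
To tie the three subcommands together, the following sketch walks one secret through add, list, and delete. The run and jks_store functions are hypothetical wrappers written for this example and are not part of the product. By default they only print each command; set DRY_RUN=0 and run the script from the $TRANSFORMER_DIST directory to execute the commands.

```shell
# Print the command instead of running it unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical wrapper that prefixes the common part of every command.
jks_store() {
  run bin/streamsets stagelib-cli jks-credentialstore "$@"
}

# Single quotes keep the shell from interpreting the & in the secret value.
jks_store add -i devjks -n OracleDBPassword -c 'df35yT_&5'

# List the secret names in the store; secret values are not shown.
jks_store list -i devjks

# Remove the secret when it is no longer needed.
jks_store delete -i devjks -n OracleDBPassword
```

In dry-run mode the printed commands show the arguments after the shell has removed the quotation marks, so the secret value appears unquoted.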