Credential Stores
You can configure Transformer to access sensitive information that is secured in a credential store.
Transformer pipelines communicate with external systems to perform tasks such as launching a Spark application, or reading and writing data. Most of these external systems require sensitive information, such as user names or passwords, to access the system. When you configure pipeline stages for these external systems, you must specify the details that the stages need to connect to the system.
If you enter sensitive information directly in stage and pipeline properties, you expose those details to any user with access to the pipeline. To access external systems without exposing sensitive details, add them as secrets to a credential store and then use StreamSets credential functions in stage and pipeline properties to retrieve those values.
Defining secrets in a credential store can make it easier to migrate pipelines to another environment. For example, if you migrate multiple pipelines from a development to a production environment, you do not need to edit each pipeline with details for the production environment. You can simply replace the development credential store with the production version.
You can configure Transformer to use multiple credential stores. Each credential store is identified by a unique credential store ID. Transformer supports the following credential store systems:
- AWS Secrets Manager
- Azure Key Vault
- CyberArk
- Google Secret Manager
- Hashicorp Vault
- Java keystore
  Important: Use a Java keystore in a development environment only. In a production environment, use a centralized keystore, such as Azure Key Vault, to better secure sensitive information.
- Thycotic Secret Server
Enabling Credential Stores
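You can configure Transformer to use one or more credential stores. Each credential store is identified by a unique credential store ID. To enable credential stores, you configure the following properties in the Transformer credential store configuration properties: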
- credentialStores property
- This property defines the credential stores that Transformer can use.
- usePortableGroups property
- This property allows you to migrate pipelines that access a credential store from one Control Hub organization to another without updating the pipeline.
  Important: Use this property only when recommended by the StreamSets Support team.
- Sets of related properties
- Each supported credential store type has a set of related properties. The property names include the default credential store IDs originally specified in the credentialStores property.
For example, say you want to use two Azure credential stores, azureDev for development and azureProd for production. To do this, you specify the credential store IDs in the credentialStores property and make a copy of the related Azure credential store properties, so you have one set for each credential store. In one set, you update the credential store ID in the property names to azureDev, and you do the same for azureProd in the other set. The resulting properties might look as follows:
################################################
# Transformer Credential Stores #
################################################
credentialStores=azureDev,azureProd
#credentialStores.usePortableGroups=false
############################################################
# azureDev: Azure Key Vault Credential Store Configuration #
############################################################
credentialStore.azureDev.def=streamsets-transformer-azure-keyvault-credentialstore-lib::com_streamsets_datacollector_credential_azure_keyvault_AzureKeyVaultCredentialStore
credentialStore.azureDev.config.credential.refresh.millis=30000
credentialStore.azureDev.config.credential.retry.millis=15000
credentialStore.azureDev.config.vault.url=https://development.vault.azure.net/
credentialStore.azureDev.config.client.id=devClientID
credentialStore.azureDev.config.client.key=devClientKey
credentialStore.azureDev.config.enforceEntryGroup=false
#############################################################
# azureProd: Azure Key Vault Credential Store Configuration #
#############################################################
credentialStore.azureProd.def=streamsets-transformer-azure-keyvault-credentialstore-lib::com_streamsets_datacollector_credential_azure_keyvault_AzureKeyVaultCredentialStore
credentialStore.azureProd.config.credential.refresh.millis=30000
credentialStore.azureProd.config.credential.retry.millis=15000
credentialStore.azureProd.config.vault.url=https://production.vault.azure.net/
credentialStore.azureProd.config.client.id=prodClientID
credentialStore.azureProd.config.client.key=prodClientKey
credentialStore.azureProd.config.enforceEntryGroup=false
Group Access to Secrets
As an additional layer of security, you can employ user groups to further limit access to the secrets defined in credential stores.
- Required group argument in credential functions
- Credential functions include a group argument that defines the group that can access the secret. The group argument ensures that the user who attempts to preview, validate, or start a pipeline that includes a credential function belongs to the group specified in the function. The user must also have execute permission on the pipeline.
- Optional group secrets in the credential store
- In addition to using the group argument in credential functions, you can configure Transformer to require group secrets for a credential store.
  To require the use of group secrets, in the Transformer credential store configuration properties, set the credentialStore.<cstore ID>.config.enforceEntryGroup property to true.
  A group secret is a secret defined in the credential store that contains a comma-delimited list of user groups permitted to access the associated secret.
  When a credential store requires group secrets, you must define a group secret for every secret that Transformer accesses in that credential store. The name of the group secret is based on the secret name, as follows:
  <secret name>-groups
  When you configure a credential function to call a secret, the user group specified in the credential function must be listed in the associated group secret that is defined in the credential store.
  For example, say that the following credential function retrieves the sharedAccessKey secret from the azure credential store:
  ${credential:get("azure", "production@9a213-b18-1eb-b9c-15ad68", "sharedAccessKey")}
  When you run the pipeline, Transformer validates all of the following:
  - The user who starts the pipeline is in the production user group.
  - The sharedAccessKey secret has an associated sharedAccessKey-groups secret defined in the credential store.
  - The sharedAccessKey-groups secret includes the production user group.
  When Transformer is not configured to require group secrets, Transformer validates only the first point, verifying that the user belongs to the specified group.
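As a rough illustration of the group secret requirement described above, the following fragment sketches the one configuration property involved, with the matching secrets shown only as comments. It assumes the default azure credential store ID and reuses the sharedAccessKey example. How the group is written inside the group secret is an assumption here, so check how groups are named in your credential store and Control Hub organization.

```properties
# Sketch only: require group secrets for the credential store with the ID "azure".
credentialStore.azure.config.enforceEntryGroup=true

# The secrets below live in the credential store itself, not in this file.
# They are listed as comments to show the naming convention:
#   sharedAccessKey         -> <the secret value>
#   sharedAccessKey-groups  -> <comma-delimited list of permitted groups>
#
# Assumption: the group is listed as it appears in the credential function,
# for example production@9a213-b18-1eb-b9c-15ad68.
```

With this setting, a credential function that calls sharedAccessKey would pass validation only if the group named in the function also appears in sharedAccessKey-groups.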
Sensitive Data in Credential Store Properties
When you define credential store properties, you must enter some sensitive data such as passwords to authenticate with the credential store system. For example, to use the AWS Secrets Manager credential store system, you enter the AWS access key ID and secret access key.
To prevent exposing the sensitive data in the Control Hub deployment details, Control Hub displays the sensitive values as REDACTED after you save the deployment.
Alternatively, instead of entering sensitive data such as passwords in the configuration properties, you can protect the sensitive data by storing the data in an external location and then using functions to retrieve the data. When you use functions in credential store properties, Control Hub does not redact the credential store property values. Control Hub displays the defined functions after you save the deployment.
AWS Secrets Manager
To use the AWS Secrets Manager credential store system, install the AWS Secrets Manager credential store stage library and define the configuration properties used to connect to Secrets Manager. Then, use credential functions in pipeline stage properties to retrieve stored values.
In Secrets Manager, you must configure an access and secret key pair with correct permission to read the key. To follow best practices, make secrets read-only and limit access. See the Secrets Manager documentation on identity and access management (IAM) policies.
Step 1. Install the Credential Store Stage Library
Before you use Secrets Manager, the AWS Secrets Manager Credential Store stage library must be included in the engine deployment and installed on the engine. The stage library is a common stage library. For details on adding stage libraries to a deployment, see the Control Hub documentation.
Step 2. Configure the Credential Store Properties
To enable Transformer to connect to the AWS Secrets Manager credential store, configure the Secrets Manager properties in the Transformer credential store configuration properties.
- In Control Hub, edit the deployment, and in the Configure Engine section, click Advanced Configuration. Then, click Credential Stores.
- Uncomment the credentialStores property in the file and specify the credential store ID to use. Use only alphabetic characters for the credential store ID.
  By default, the property lists a default credential store ID for each type of credential store: aws for AWS Secrets Manager, azure for Azure Key Vault, and so on. When using one credential store of any type, it's simplest to use the default value.
  To use just a single Secrets Manager, set the value to aws.
  To enable multiple credential stores, specify a comma-separated list of credential store IDs. For example, to use a Java keystore and a Secrets Manager credential store, set the value to jks,aws. To use multiple Secrets Manager credential stores, simply specify separate IDs for each, such as awsDev,awsProd.
- Uncomment and configure the following properties as needed.
  If you specified a custom credential store ID, update the names of the following properties, and then configure them as needed. When using the default credential store ID, aws, leave the property names intact, and simply configure the properties.
  To use multiple AWS Secrets Manager credential stores, make a copy of the properties for each credential store. Then, update the credential store ID in each set of property names before defining the properties. For an example, see Enabling Credential Stores.
  Note: Control Hub displays sensitive data such as passwords as REDACTED after you save the deployment.
  These properties are grouped in the AWS Secrets Manager section of the file:
| Secrets Manager Property | Description |
| --- | --- |
| credentialStore.<cstore ID>.def | Required. Defines the implementation of the AWS Secrets Manager credential store. Do not change the default value. |
| credentialStore.<cstore ID>.config.nameKey.separator | Optional. Separator to use in the name argument for credential functions. Note: In Secrets Manager, names can contain alphanumeric and the following special characters: / _ + = . @ - Therefore, avoid using those characters as separators. |
| credentialStore.<cstore ID>.config.region | Required. AWS region that hosts Secrets Manager. For a list of available regions, see the AWS Region Table. |
| credentialStore.<cstore ID>.config.security.method | Required. Authentication method used to connect to AWS. Set to one of the following values: instanceProfile: Authenticates using an instance profile associated with Transformer. Use when Transformer runs on an Amazon EC2 instance that has an associated instance profile. Transformer uses the instance profile credentials to automatically authenticate with AWS. accessKeys: Authenticates using an AWS access key pair. Use when Transformer does not run on an Amazon EC2 instance or when the EC2 instance doesn't have an instance profile. |
| credentialStore.<cstore ID>.config.access.key | Required when using access keys to authenticate with AWS. AWS access key. |
| credentialStore.<cstore ID>.config.secret.key | Required when using access keys to authenticate with AWS. AWS secret key. |
| credentialStore.<cstore ID>.config.cache.max.size | Optional. Maximum number of secrets Transformer can cache locally. Default is 1024. |
| credentialStore.<cstore ID>.config.cache.ttl.millis | Optional. Number of milliseconds that Transformer considers a cached secret valid before requiring a refresh. Default is 1 hour. |
| credentialStore.<cstore ID>.config.enforceEntryGroup | Optional. Requires Transformer to verify if the user who previews, validates, or starts the pipeline belongs to a group that is permitted to access the secret. When set to true, each secret must have a corresponding <secret name>-groups secret that contains a comma-separated list of groups that is permitted to access the secret. For more information, see Group Access to Secrets. Default is false. |
- Save the changes to the deployment and restart all engine instances.
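To make the steps above more concrete, here is a minimal sketch of what the configured properties might look like for a single AWS Secrets Manager credential store that keeps the default aws ID and authenticates with access keys. The region value and the bracketed placeholders are hypothetical, and the sketch assumes the region is entered as a region code. The .def property is deliberately left out because its default value should not be changed.

```properties
# Sketch only: one AWS Secrets Manager credential store with the default ID "aws".
credentialStores=aws

# credentialStore.aws.def is left at its default value (not shown).

# Hypothetical region, assuming a region code is expected.
credentialStore.aws.config.region=us-west-2

# Use accessKeys when Transformer has no EC2 instance profile to rely on.
credentialStore.aws.config.security.method=accessKeys
credentialStore.aws.config.access.key=<AWS access key>
credentialStore.aws.config.secret.key=<AWS secret key>

# Documented defaults: 1024 cached secrets, valid for 1 hour (3600000 ms).
credentialStore.aws.config.cache.max.size=1024
credentialStore.aws.config.cache.ttl.millis=3600000

# Set to true to require <secret name>-groups secrets.
credentialStore.aws.config.enforceEntryGroup=false
```

When Transformer runs on an EC2 instance with an instance profile, the security method would instead be instanceProfile, and the two key properties would not be needed.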
Step 3. Call Secrets from the Pipeline
Specify credential functions in stage or pipeline properties to retrieve secrets stored in AWS Secrets Manager.
Use the credential functions in any stage property that displays the key icon next to it.
For details about credential functions, see Credential Functions.
Azure Key Vault
Before Transformer can connect to the Microsoft Azure Key Vault credential store system, you must complete several prerequisites in Azure so that Transformer can access the Azure Key Vault as an application.
After completing the prerequisites, install the Azure Key Vault credential store stage library and define the configuration properties used to connect to Azure Key Vault. Then, define credential functions in stage or pipeline properties to retrieve stored values.
Prerequisites
Before Transformer can connect to the Microsoft Azure Key Vault credential store system, you must complete the following prerequisites within Azure:
- Register Transformer with Azure Active Directory
- Use the Azure portal to register Transformer as an application in Azure Active Directory. When an application such as Transformer accesses secrets in an Azure key vault, the application must use an authentication token from Azure Active Directory.
- Authorize Transformer to use keys or secrets in the Azure key vault
- Use the Azure portal to authorize Transformer to use the keys or secrets in the Azure key vault. Azure Key Vault requires that applications be authorized to access each key vault.
Step 1. Install the Credential Store Stage Library
Before you use Azure Key Vault, the Azure Key Vault Credential Store stage library must be included in the engine deployment and installed on the engine. The stage library is a common stage library. For details on adding stage libraries to a deployment, see the Control Hub documentation.
Step 2. Configure the Credential Store Properties
To enable Transformer to connect to the Azure Key Vault credential store, configure the Azure Key Vault properties in the Transformer credential store configuration properties.
- In Control Hub, edit the deployment, and in the Configure Engine section, click Advanced Configuration. Then, click Credential Stores.
- Uncomment the credentialStores property in the file and specify the credential store ID to use. Use only alphabetic characters for the credential store ID.
  By default, the property lists a default credential store ID for each type of credential store: aws for AWS Secrets Manager, azure for Azure Key Vault, and so on. When using one credential store of any type, it's simplest to use the default value.
  To use just a single Azure Key Vault, set the value to azure.
  To enable multiple credential stores, specify a comma-separated list of credential store IDs. For example, to use a Java keystore and an Azure Key Vault credential store, set the value to jks,azure. To use multiple Azure Key Vault credential stores, simply specify separate IDs for each, such as azureDev,azureProd.
- Uncomment and configure the following properties as needed.
  If you specified a custom credential store ID, update the names of the following properties, and then configure them as needed. When using the default credential store ID, azure, leave the property names intact, and simply configure the properties.
  To use multiple Azure Key Vault credential stores, make a copy of the properties for each credential store. Then, update the credential store ID in each set of property names before defining the properties. For an example, see Enabling Credential Stores.
  Note: Control Hub displays sensitive data such as passwords as REDACTED after you save the deployment.
  The properties are grouped in the Azure Key Vault section of the file:
| Azure Key Vault Property | Description |
| --- | --- |
| credentialStore.<cstore ID>.def | Required. Defines the implementation of the Azure Key Vault credential store. Do not change the default value. |
| credentialStore.<cstore ID>.config.credential.refresh.millis | Optional. Number of milliseconds that Transformer locally caches a secret. When the time expires, Transformer retrieves the secret from Azure Key Vault. |
| credentialStore.<cstore ID>.config.credential.retry.millis | Optional. Number of milliseconds that Transformer waits before attempting to retry a retrieval of a secret from Azure Key Vault, in the case of an error. |
| credentialStore.<cstore ID>.config.vault.url | Required. URL to the key vault created in Azure Key Vault. Use the following format: https://<key vault name>.vault.azure.net/ |
| credentialStore.<cstore ID>.config.client.id | Required. Application ID assigned to this Transformer when you registered Transformer as an application in Azure Active Directory, as described in Prerequisites. |
| credentialStore.<cstore ID>.config.client.key | Required. Authentication key assigned to this Transformer when you registered Transformer as an application in Azure Active Directory, as described in Prerequisites. |
| credentialStore.<cstore ID>.config.enforceEntryGroup | Optional. Requires Transformer to verify if the user who previews, validates, or starts the pipeline belongs to a group that is permitted to access the secret. When set to true, each secret must have a corresponding <secret name>-groups secret that contains a comma-separated list of groups that is permitted to access the secret. For more information, see Group Access to Secrets. Default is false. |
- Save the changes to the deployment and restart all engine instances.
Step 3. Call Secrets from the Pipeline
Specify credential functions in stage or pipeline properties to retrieve secrets stored in Azure Key Vault.
You can configure credential functions in any property that displays the key icon next to it.
For details about credential functions, see Credential Functions.
CyberArk
To use the CyberArk credential store system, install the CyberArk Credential Store stage library and define the configuration properties used to connect to CyberArk Application Identity Manager. Then, use credential functions in pipeline stage properties to retrieve stored values.
Step 1. Install the Credential Store Stage Library
Before you use CyberArk, the CyberArk Credential Store stage library must be included in the engine deployment and installed on the engine. The stage library is a common stage library. For details on adding stage libraries to a deployment, see the Control Hub documentation.
Step 2. Configure the Credential Store Properties
To enable Transformer to connect to the CyberArk credential store, configure the CyberArk properties in the Transformer credential store configuration properties.
- In Control Hub, edit the deployment, and in the Configure Engine section, click Advanced Configuration. Then, click Credential Stores.
- Uncomment the credentialStores property in the file and specify the credential store ID to use. Use only alphabetic characters for the credential store ID.
  By default, the property lists a default credential store ID for each type of credential store: aws for AWS Secrets Manager, azure for Azure Key Vault, and so on. When using one credential store of any type, it's simplest to use the default value.
  To use just a single CyberArk credential store, set the value to cyberark.
  To enable multiple credential stores, specify a comma-separated list of credential store IDs. For example, to use a Java keystore and a CyberArk credential store, set the value to jks,cyberark. To use multiple CyberArk credential stores, simply specify separate IDs for each, such as cyberarkDev,cyberarkProd.
- Uncomment and configure the following properties as needed.
  If you specified a custom credential store ID, update the names of the following properties, and then configure them as needed. When using the default credential store ID, cyberark, leave the property names intact, and simply configure the properties.
  To use multiple CyberArk credential stores, make a copy of the properties for each credential store. Then, update the credential store ID in each set of property names before defining the properties. For an example, see Enabling Credential Stores.
  Note: Control Hub displays sensitive data such as passwords as REDACTED after you save the deployment.
  These properties are grouped in the CyberArk section of the file:
| CyberArk Property | Description |
| --- | --- |
| credentialStore.<cstore ID>.def | Required. Defines the implementation of the CyberArk credential store. Do not change the default value. |
| credentialStore.<cstore ID>.config.credential.refresh.millis | Optional. Number of milliseconds that Transformer locally caches a credential. When the time expires, Transformer retrieves the credential from CyberArk. |
| credentialStore.<cstore ID>.config.credential.retry.millis | Optional. Number of milliseconds that Transformer waits before attempting to retry a retrieval of a credential from CyberArk, in the case of an error. |
| credentialStore.<cstore ID>.config.connector | Optional. Connector type to CyberArk. Leave the default, webservices, since only web services is currently supported. |
| credentialStore.<cstore ID>.config.ws.url | Required. CyberArk Central Credential Provider web service URL. Use the following format: https://<host name>:<port>/AIMWebService/api/Accounts |
| credentialStore.<cstore ID>.config.ws.appId | Required. CyberArk application ID for this Transformer. You must create the application ID in CyberArk. |
| credentialStore.<cstore ID>.config.ws.maxConcurrentConnections | Optional. Maximum number of concurrent web service calls that Transformer can make to CyberArk. |
| credentialStore.<cstore ID>.config.ws.validateAfterInactivity.millis | Optional. Number of milliseconds of inactivity before Transformer validates the HTTP connection to CyberArk. |
| credentialStore.<cstore ID>.config.ws.connectionTimeout.millis | Optional. Number of milliseconds to wait for a connection to CyberArk. |
| credentialStore.<cstore ID>.config.ws.nameSeparator | Optional. Separator to use in the name argument that credential functions use. Use the following format for the name argument: <safe><separator><folder><separator><object name><separator><element name> For example, if you keep the default ampersand (&), the format for the name argument is: <safe>&<folder>&<object name>&<element name> |
| credentialStore.<cstore ID>.config.ws.http.authentication | Optional. Authentication type used by the CyberArk Central Credential Provider web services: none, basic, or digest. Default is none. |
| credentialStore.<cstore ID>.config.ws.http.authentication.user | Optional. User name if using basic or digest authentication. |
| credentialStore.<cstore ID>.config.ws.http.authentication.password | Optional. Password if using basic or digest authentication. To protect the password, store the password in an external location and then use a function to retrieve the password. |
| credentialStore.<cstore ID>.config.ws.truststoreFile | Optional. Path to the truststore file if using HTTPS and the server certificate is using a private CA or is not trusted by the Java default truststore file. Enter a path relative to the Transformer configuration directory. |
| credentialStore.<cstore ID>.config.ws.truststorePassword | Optional. Password for the truststore file. To protect the password, store the password in an external location and then use a function to retrieve the password. |
| credentialStore.<cstore ID>.config.ws.supportedProtocols | Optional. SSL/TLS-enabled protocols. Versions TLSv1.2 or later are recommended. |
| credentialStore.<cstore ID>.config.ws.hostnameVerifier.skip | Optional. Determines whether the host name of the CyberArk Central Credential Provider web services should be verified against the domain defined in the HTTPS certificate. By default, the host name is verified. |
| credentialStore.<cstore ID>.config.ws.keystoreFile | Optional. If using HTTPS and the CyberArk Central Credential Provider web services requires client side certificates, the path to the keystore file that contains the client certificate. Enter a path relative to the Transformer configuration directory. |
| credentialStore.<cstore ID>.config.ws.keystorePassword | Optional. Password for the keystore file. To protect the password, store the password in an external location and then use a function to retrieve the password. |
| credentialStore.<cstore ID>.config.ws.keyPassword | Optional. Password to access the certificate within the keystore file. To protect the password, store the password in an external location and then use a function to retrieve the password. |
| credentialStore.<cstore ID>.config.enforceEntryGroup | Optional. Requires Transformer to verify if the user who previews, validates, or starts the pipeline belongs to a group that is permitted to access the secret. When set to true, each secret must have a corresponding <secret name>-groups secret that contains a comma-separated list of groups that is permitted to access the secret. For more information, see Group Access to Secrets. Default is false. |
- Save the changes to the deployment and restart all engine instances.
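As an illustration of the steps above, the following sketch shows a minimal set of properties for a single CyberArk credential store that keeps the default cyberark ID. The host name, port, and application ID are hypothetical placeholders. Only the required properties and a few of the optional ones are shown, and the .def property is omitted because its default value should not be changed.

```properties
# Sketch only: one CyberArk credential store with the default ID "cyberark".
credentialStores=cyberark

# credentialStore.cyberark.def is left at its default value (not shown).

# Only the web services connector is supported, so keep the default.
credentialStore.cyberark.config.connector=webservices

# Hypothetical host and port, following the documented URL format.
credentialStore.cyberark.config.ws.url=https://cyberark.example.com:443/AIMWebService/api/Accounts

# Hypothetical application ID, which must already exist in CyberArk.
credentialStore.cyberark.config.ws.appId=transformerApp

# Default separator, so names take the form <safe>&<folder>&<object name>&<element name>.
credentialStore.cyberark.config.ws.nameSeparator=&

# No HTTP authentication (the documented default).
credentialStore.cyberark.config.ws.http.authentication=none

credentialStore.cyberark.config.enforceEntryGroup=false
```

If the web service used basic or digest authentication, the user and password properties would be set as well, with the password ideally retrieved from an external location rather than entered in plain text.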
Step 3. Call Secrets from the Pipeline
Specify credential functions in stage or pipeline properties to retrieve secrets stored in CyberArk.
Use the credential functions in any stage property that displays the key icon next to it.
For details about credential functions, see Credential Functions.
Google Secret Manager
To use a Google Secret Manager credential store system, install the Google Secret Manager Credential Store stage library and define the configuration properties used to connect to Secret Manager. Then, use a credential function in pipeline stage properties to retrieve stored values.
As a best practice, make secrets read-only and limit access. For additional suggestions, see the Google Secret Manager best practices documentation.
Authentication
Transformer must authenticate with Google Secret Manager using Google credentials.
- Default
- Transformer authenticates with Google Secret Manager using the credentials file defined in the GOOGLE_APPLICATION_CREDENTIALS environment variable.
- JSON
- Transformer authenticates with Google Secret Manager using JSON-formatted credential information specified in the credential store configuration properties. You copy the JSON content from a Google Cloud service account credentials file.
- JSON Path
- Transformer authenticates with Google Secret Manager using a Google Cloud service account credentials file. Store the file in the same location on the Transformer machine and on each node in the Spark cluster.
For information about generating a service account credential file, see the Google Cloud Platform documentation.
Step 1. Install the Credential Store Stage Library
Before you use Secret Manager, the Google Secret Manager Credential Store stage library must be included in the engine deployment and installed on the engine. The stage library is a common stage library. For details on adding stage libraries to a deployment, see the Control Hub documentation.
Step 2. Configure the Credential Store Properties
To enable Transformer to connect to the Google Secret Manager credential store, configure the Secret Manager properties in the Transformer credential store configuration properties.
- In Control Hub, edit the deployment, and in the Configure Engine section, click Advanced Configuration. Then, click Credential Stores.
- Uncomment the credentialStores property in the file and specify the credential store ID to use. Use only alphabetic characters for the credential store ID.
  By default, the property lists a default credential store ID for each type of credential store: aws for AWS Secrets Manager, azure for Azure Key Vault, and so on. When using one credential store of any type, it's simplest to use the default value.
  To use just a single Secret Manager, set the value to gcp.
  To enable multiple credential stores, specify a comma-separated list of credential store IDs. For example, to use a Java keystore and a Secret Manager credential store, set the value to jks,gcp. To use multiple Secret Manager credential stores, simply specify separate IDs for each, such as gcpDev,gcpProd.
- Uncomment and configure the following properties as needed.
  If you specified a custom credential store ID, update the names of the following properties, and then configure them as needed. When using the default credential store ID, gcp, leave the property names intact, and simply configure the properties.
  To use multiple Google Secret Manager credential stores, make a copy of the properties for each credential store. Then, update the credential store ID in each set of property names before defining the properties. For an example, see Enabling Credential Stores.
  Note: Control Hub displays sensitive data such as passwords as REDACTED after you save the deployment.
  The properties are grouped in the Google Secret Manager section of the file:
| Secret Manager Property | Description |
| --- | --- |
| credentialStore.<cstore ID>.def | Required. Defines the implementation of the Google Secret Manager credential store. Do not change the default value. |
| credentialStore.<cstore ID>.config.cache.inactivityExpiration.millis | Expiration time for the cache in milliseconds. Default is 1800000. |
| credentialStore.<cstore ID>.config.delimiter | Delimiter to use in the credential function name argument to separate the secret name and the version ID. Use a single character that is not included in credential names. Use the following format for the name argument: <name><delimiter><version id> For example, if you use a slash, the format for the name argument is: <name>/<version id> Default is question mark (?). |
| credentialStore.<cstore ID>.config.project.id | ID of the project associated with Secret Manager. |
| credentialStore.<cstore ID>.config.credentialsMode | Credentials to use for authentication with Secret Manager: default: Uses Google Cloud default credentials. json: Uses JSON-formatted credentials information specified in the credential store configuration properties. jsonPath: Uses a JSON service account credentials file stored in the same location on the Transformer machine and on each node in the Spark cluster. For more information, see Authentication. |
| credentialStore.<cstore ID>.config.credentialsJson | Contents of a Google Cloud service account credentials JSON file. Enter JSON-formatted credential information in plain text. If the content includes multiple lines of text, add a backslash (\) at the end of each line. Required when using the json credentials mode. |
| credentialStore.<cstore ID>.config.credentialsJsonPath | Absolute path to the Google Cloud service account credentials file stored in the same location on the Transformer machine and on each node in the Spark cluster. The credentials file must be a JSON file. Required when using the jsonPath credentials mode. |
| credentialStore.<cstore ID>.config.enforceEntryGroup | Optional. Requires Transformer to verify if the user who previews, validates, or starts the pipeline belongs to a group that is permitted to access the secret. When set to true, each secret must have a corresponding <secret name>-groups secret that contains a comma-separated list of groups that is permitted to access the secret. For more information, see Group Access to Secrets. Default is false. |
- Save the changes to the deployment and restart all engine instances.
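To illustrate the steps above, the following sketch shows how the properties might be set for a single Google Secret Manager credential store that keeps the default gcp ID and uses the jsonPath credentials mode. The project ID and file path are hypothetical placeholders, and the .def property is omitted because its default value should not be changed.

```properties
# Sketch only: one Google Secret Manager credential store with the default ID "gcp".
credentialStores=gcp

# credentialStore.gcp.def is left at its default value (not shown).

# Documented default cache expiration, in milliseconds.
credentialStore.gcp.config.cache.inactivityExpiration.millis=1800000

# Documented default delimiter, so names take the form <name>?<version id>.
credentialStore.gcp.config.delimiter=?

# Hypothetical project ID.
credentialStore.gcp.config.project.id=my-project-id

# jsonPath mode reads a service account credentials file from disk.
credentialStore.gcp.config.credentialsMode=jsonPath

# Hypothetical absolute path. The file must exist at this location on the
# Transformer machine and on each node in the Spark cluster.
credentialStore.gcp.config.credentialsJsonPath=/opt/credentials/service-account.json

credentialStore.gcp.config.enforceEntryGroup=false
```

With the default credentials mode, neither credentialsJson nor credentialsJsonPath would be needed, because Transformer would use the file named in the GOOGLE_APPLICATION_CREDENTIALS environment variable.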
Step 3. Call Secrets from the Pipeline
Specify credential functions in stage or pipeline properties to retrieve secrets stored in Google Secret Manager.
You can configure credential functions in any property that displays the key icon next to it.
For details about credential functions, see Credential Functions.
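The name argument format described in the configuration properties, <name><delimiter><version id>, can be assembled with a small helper before it is passed to a credential function. The following Python sketch assumes the default question mark delimiter. The function name and the sample secret name and version are hypothetical.

```python
def secret_name_with_version(name, version_id, delimiter="?"):
    """Build the name argument in the form <name><delimiter><version id>."""
    if not delimiter:
        raise ValueError("delimiter must not be empty")
    if delimiter in name:
        # A delimiter inside the secret name would make the argument ambiguous.
        raise ValueError("secret name must not contain the delimiter")
    return f"{name}{delimiter}{version_id}"


if __name__ == "__main__":
    # Hypothetical secret name and version ID.
    print(secret_name_with_version("dbPassword", "3"))
    print(secret_name_with_version("dbPassword", "3", delimiter="/"))
```

If the delimiter property is changed in the credential store configuration, pass the same value to the helper so that the name argument stays consistent with it.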
Hashicorp Vault
To use the Hashicorp Vault credential store system, install the Vault Credential Store stage library and define the configuration properties used to connect to Hashicorp Vault. Then, use credential functions in pipeline stage properties to retrieve stored values.
Step 1. Install the Credential Store Stage Library
Before you use Hashicorp Vault, the Vault Credential Store stage library must be included in the engine deployment and installed on the engine. The stage library is a common stage library. For details on adding stage libraries to a deployment, see the Control Hub documentation.
Step 2. Configure the Credential Store Properties
To enable Transformer to connect to the Hashicorp Vault credential store, configure the Hashicorp Vault properties in the Transformer credential store configuration properties.
- In Control Hub, edit the deployment, and in the Configure Engine section, click Advanced Configuration. Then, click Credential Stores.
- Uncomment the credentialStores property in the file and specify the credential store ID to use. Use only alphabetic characters for the credential store ID.
By default, the property lists a default credential store ID for each type of credential store, aws for AWS Secrets Manager, azure for Azure Key Vault, and so on. When using one credential store of any type, it's simplest to use the default value.
To use just a single Hashicorp Vault credential store, set the value to vault.
To enable multiple credential stores, specify a comma-separated list of credential store IDs. For example, to use a Java keystore and a Hashicorp Vault credential store, set the value to jks,vault. To use multiple Hashicorp Vault credential stores, simply specify separate IDs for each, such as vaultDev,vaultProd.
- Uncomment and configure the following properties as needed.
If you specified a custom credential store ID, update the names of the following properties, and then configure them as needed. When using the default credential store ID, vault, leave the property names intact, and simply configure the properties.
To use multiple Hashicorp Vault credential stores, make a copy of the properties for each credential store. Then, update the credential store ID in each set of property names before defining the properties. For an example, see Enabling Credential Stores.
Note: Control Hub displays sensitive data such as passwords as REDACTED after you save the deployment.
These properties are grouped in the Hashicorp Vault section of the file:
Hashicorp Vault Property Description
credentialStore.<cstore ID>.def Required. Defines the implementation of the Vault credential store. Do not change the default value.
credentialStore.<cstore ID>.config.pathKey.separator Optional. Separator to use in the name argument that credential functions use. Use the following format for the name argument: <path><separator><key>
For example, if you keep the default ampersand (&), the format for the name argument is: <path>&<key>
credentialStore.<cstore ID>.config.addr Required. Vault server URL entered in the following format: https://<host name>:<port number>
Use HTTPS to avoid unencrypted communication. The Transformer machine and each node in the Spark cluster must have access to this URL.
credentialStore.<cstore ID>.config.role.id Required. Vault Role ID that Transformer uses to authenticate with Vault. The Role ID is configured within Vault by your Vault administrator. The Transformer Vault integration relies on Vault's App Role authentication backend.
Important: The App ID authentication backend has been deprecated by Hashicorp and will be removed in a future release. As a result, do not configure the credentialStore.vault.config.app.id property for new installations.
credentialStore.<cstore ID>.config.secret.id Required. Vault Secret ID that Transformer uses to authenticate with Vault. The Secret ID is configured within Vault by your Vault administrator. To protect the Secret ID, store the Secret ID in an external location and then use a function to retrieve the Secret ID.
Default uses the file function to retrieve the Secret ID from vault-secret-id in the Transformer configuration directory.
credentialStore.<cstore ID>.config.version Version of the KV secrets engine used by Vault. Enter 1 or 2. Default is 1.
credentialStore.<cstore ID>.config.lease.renewal.interval.sec Optional. Seconds to wait before checking for leases that need renewal. Default is 60.
credentialStore.<cstore ID>.config.lease.expiration.buffer.sec Optional. Buffer for expiring leases. Transformer renews leases that expire in less than the specified number of seconds. Default is 120.
credentialStore.<cstore ID>.config.open.timeout Optional. Timeout to establish an HTTP connection to Vault in milliseconds. Default is 0 for no limit.
credentialStore.<cstore ID>.config.proxy.address Optional. Proxy URL. Configure to use a proxy to access Vault.
credentialStore.<cstore ID>.config.proxy.port Optional. Proxy port. Configure to use a proxy to access Vault.
credentialStore.<cstore ID>.config.proxy.username Optional. Proxy username. Configure to use a proxy to access Vault.
credentialStore.<cstore ID>.config.proxy.password Optional. Proxy password. Configure to use a proxy to access Vault. To protect the password, store the password in an external location and then use a function to retrieve the password.
credentialStore.<cstore ID>.config.read.timeout Optional. Milliseconds to wait for data before timing out. Default is 0 for no limit.
credentialStore.<cstore ID>.config.ssl.enabled.protocols Optional. SSL/TLS-enabled protocols. Versions TLSv1.2 or later are recommended. Default is TLSv1.2,TLSv1.3.
credentialStore.<cstore ID>.config.ssl.truststore.file Optional. Path to a Java truststore file. Required when using a private CA or certificates not trusted by the Java default truststore.
credentialStore.<cstore ID>.config.ssl.truststore.password Optional. Password for the truststore file. To protect the password, store the password in an external location and then use a function to retrieve the password.
credentialStore.<cstore ID>.config.ssl.verify Optional. Whether to verify that the Vault server hostname matches its certificate. Default is true. False is not recommended.
credentialStore.<cstore ID>.config.ssl.timeout Optional. Timeout for the SSL/TLS handshake in milliseconds. Default is 0 for no limit.
credentialStore.<cstore ID>.config.timeout Optional. Timeout to read from Vault in milliseconds, after a connection has been established. Default is 0 for no limit.
credentialStore.<cstore ID>.config.enforceEntryGroup Optional. Requires Transformer to verify that the user who previews, validates, or starts the pipeline belongs to a group that is permitted to access the secret. When set to true, each secret must have a corresponding <secret name>-groups secret that contains a comma-separated list of groups that are permitted to access the secret. For more information, see Group Access to Secrets. Default is false.
- Save the changes to the deployment and restart all engine instances.
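Copying the property set for several Hashicorp Vault credential stores, as described in the steps above, is a mechanical rename of the credential store ID in each property name. The following Python sketch shows one way to do that. The helper name, the sample Vault URL, and the sample values are hypothetical, and in practice each store would normally receive its own values after the copy.

```python
def expand_vault_stores(template, store_ids, default_id="vault"):
    """Copy a set of property lines once per credential store ID."""
    for store_id in store_ids:
        # The documentation allows only alphabetic characters in store IDs.
        if not store_id.isalpha():
            raise ValueError(f"invalid credential store ID: {store_id!r}")
    lines = ["credentialStores=" + ",".join(store_ids)]
    old_prefix = f"credentialStore.{default_id}."
    for store_id in store_ids:
        new_prefix = f"credentialStore.{store_id}."
        for line in template:
            if line.startswith(old_prefix):
                # Swap only the ID in the property name; keep the rest intact.
                lines.append(new_prefix + line[len(old_prefix):])
            else:
                lines.append(line)
    return lines


if __name__ == "__main__":
    # Hypothetical template values, for illustration only.
    template = [
        "credentialStore.vault.config.addr=https://vault.example.com:8200",
        "credentialStore.vault.config.version=2",
    ]
    for line in expand_vault_stores(template, ["vaultDev", "vaultProd"]):
        print(line)
```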
Step 3. Call Secrets from the Pipeline
Specify credential functions in stage or pipeline properties to retrieve secrets stored in Hashicorp Vault.
Use the credential functions in any stage property that displays the key icon next to it.
For details about credential functions, see Credential Functions.
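As a rough illustration, a credential function call for a Vault secret can be assembled from the <path><separator><key> name format. The following Python sketch assumes that credential:get takes the credential store ID, a user group, and the secret name, in that order, as described in the Credential Functions topic. The argument order, the group name all, and the sample path and key are assumptions, not details confirmed in this section.

```python
def vault_secret_name(path, key, separator="&"):
    """Build the name argument in the form <path><separator><key>."""
    if separator in path or separator in key:
        # The separator must stay unambiguous within the name argument.
        raise ValueError("separator must not appear in the path or key")
    return f"{path}{separator}{key}"


def credential_get_expression(cstore_id, user_group, name):
    """Format a credential:get() call (assumed argument order: ID, group, name)."""
    return f'${{credential:get("{cstore_id}", "{user_group}", "{name}")}}'


if __name__ == "__main__":
    # Hypothetical path, key, and group.
    name = vault_secret_name("secret/db", "password")
    print(credential_get_expression("vault", "all", name))
```

If you change the pathKey.separator property, pass the same separator to vault_secret_name so that the generated name matches the configuration.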
Java Keystore
To use the Java keystore credential store system, install the Java keystore credential store stage library and define the configuration properties used to connect to the credential store. Use the stagelib-cli jks-credentialstore command to add secrets to the credential store. Then, use credential functions in stage or pipeline properties to retrieve those secrets.
A Java keystore credential storage system requires the distribution of a keystore file, which complicates security. Before using a Java keystore system, decide how the keystore will be distributed and consult with your IT security team to ensure that the system meets IT policies.
Step 1. Install the Credential Store Stage Library
Before you use a Java keystore, the Java Keystore Credential Store stage library must be included in the engine deployment and installed on the engine. The stage library is a common stage library. For details on adding stage libraries to a deployment, see the Control Hub documentation.
Step 2. Configure Credential Store Properties
To enable Transformer to connect to the Java keystore credential store, configure the Java keystore properties in the Transformer credential store configuration properties.
- In Control Hub, edit the deployment, and in the Configure Engine section, click Advanced Configuration. Then, click Credential Stores.
- Uncomment the credentialStores property in the file and specify the credential store ID to use. Use only alphabetic characters for the credential store ID.
By default, the property lists a default credential store ID for each type of credential store, aws for AWS Secrets Manager, azure for Azure Key Vault, and so on. When using one credential store of any type, it's simplest to use the default value.
To use just a single Java keystore, set the value to jks.
To enable multiple credential stores, specify a comma-separated list of credential store IDs. For example, to use a Java keystore and a Secrets Manager credential store, set the value to jks,aws. To use multiple Java keystore credential stores, simply specify separate IDs for each, such as jksDev,jksProd.
- Uncomment and configure the following properties as needed.
If you specified a custom credential store ID, update the names of the following properties, and then configure them as needed. When using the default credential store ID, jks, leave the property names intact, and simply configure the properties.
To use multiple Java keystore credential stores, make a copy of the properties for each credential store. Then, update the credential store ID in each set of property names before defining the properties. For an example, see Enabling Credential Stores.
Note: Control Hub displays sensitive data such as passwords as REDACTED after you save the deployment.
These properties are grouped in the Java keystore section of the file:
Java Keystore Property Description
credentialStore.<cstore ID>.def Required. Defines the implementation of the Java Keystore credential store. Do not change the default value.
credentialStore.<cstore ID>.config.keystore.type Required. Format of the Java keystore file:
- JCEKS
- PKCS12
Default is PKCS12.
credentialStore.<cstore ID>.config.keystore.file Required. Path and name of the Java keystore file. Enter an absolute path to the file, or a path relative to the Transformer configuration directory. Default is jks-credentialStore.pkcs12.
credentialStore.<cstore ID>.config.keystore.storePassword Required. Password that Transformer uses to access the Java keystore file. You must change the default value before using the keystore file.
To protect the password, store the password in an external location and then use a function to retrieve the password.
credentialStore.<cstore ID>.config.keystore.file.min.refresh.millis Milliseconds that Transformer waits before reloading the keystore file. Default is 10000, or ten seconds.
- Save the changes to the deployment and restart all engine instances.
Step 3. Add Secrets to the Java Keystore
Use the stagelib-cli jks-credentialstore command to define secrets in the Java keystore file. You can define multiple secrets in the file.
Use the command from the Transformer installation directory as follows:
bin/streamsets stagelib-cli jks-credentialstore add -i <cstore ID> -n <secret name> -c <secret value>
For example, the following command adds a secret named OracleDBPassword with the value 278yT6u to the jks Java keystore credential store:
bin/streamsets stagelib-cli jks-credentialstore add -i jks -n OracleDBPassword -c 278yT6u
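When secrets are added from a script instead of an interactive shell, passing the arguments as a list avoids shell quoting problems with special characters in secret values. The following Python sketch wraps the add command shown above. The function names and the installation directory are hypothetical, and the sketch assumes a POSIX system where the relative path bin/streamsets is resolved against the working directory passed to subprocess.run.

```python
import subprocess


def jks_add_command(cstore_id, secret_name, secret_value):
    """Return the argument list for the jks-credentialstore add subcommand."""
    return [
        "bin/streamsets", "stagelib-cli", "jks-credentialstore", "add",
        "-i", cstore_id,      # credential store ID
        "-n", secret_name,    # secret name
        "-c", secret_value,   # secret value
    ]


def add_secret(install_dir, cstore_id, secret_name, secret_value):
    """Run the add subcommand from the Transformer installation directory."""
    subprocess.run(
        jks_add_command(cstore_id, secret_name, secret_value),
        cwd=install_dir,  # the command is run from the installation directory
        check=True,       # raise CalledProcessError on a non-zero exit code
    )


if __name__ == "__main__":
    # Prints the argument list only; nothing is executed here.
    print(jks_add_command("jks", "OracleDBPassword", "278yT6u"))
```

Because the secret value is passed as a command-line argument, it can be visible to other users in the process list while the command runs, so run the script only on a host you trust.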
The stagelib-cli jks-credentialstore command also includes delete and list subcommands that you use to manage the secrets defined in the keystore file. For information on using these commands, see jks-credentialstore Command.
Step 4. Call Secrets from the Pipeline
Specify the credential:get() function in stage or pipeline properties to call secrets from a Java keystore.
You can configure a credential function in any property that displays the key icon next to it.
For details about the credential:get() function, see Credential Functions.
jks-credentialstore Command
The stagelib-cli jks-credentialstore command provides subcommands to add, list, and delete secrets in the Java keystore credential store.
Any changes made to the Java keystore file take effect immediately. For example, if you change the value of an existing secret in the file, running pipelines that require a new connection to the external system use the updated value.
You can use the following subcommands with the stagelib-cli jks-credentialstore command:
- add
- Adds a secret to the Java keystore credential store.
- delete
- Deletes a secret from the Java keystore credential store.
- list
- Lists the names of all secrets defined in the Java keystore credential store. The command does not list the secret values.