Credential Stores

When your organization uses deployed Transformer for Snowflake engines, you can configure Transformer for Snowflake to access sensitive information that is secured in a credential store.

Transformer for Snowflake pipelines that run on deployed engines use connections to connect to Snowflake. When you configure a connection, you specify sensitive information, such as user names and passwords or private keys.

When you enter sensitive information directly in connection properties, users with certain roles, such as Deployment Manager, can access that information from the engine installation directory.

To best secure sensitive information, add the data as secrets to a credential store and then use StreamSets credential functions in credential properties to retrieve those values.

Defining secrets in a credential store can also make it easier to migrate pipelines to another environment. For example, if you migrate multiple pipelines from a development to a production environment, you do not need to edit each pipeline with details for the production environment. You can simply replace the development credential store with the production version.

You can configure Transformer for Snowflake to use multiple credential stores. Each credential store is identified by a unique credential store ID.

At this time, you can use the AWS Secrets Manager credential store with Transformer for Snowflake.

Enabling Credential Stores

You can configure Transformer for Snowflake to use one or more credential stores. Each credential store is identified by a unique credential store ID.

You specify the credential stores that Transformer for Snowflake can use in the Transformer for Snowflake credential configuration properties. To enable a credential store, you configure the following information:
credentialStores property
This property defines the credential stores that Transformer for Snowflake can use.
By default, the property is commented out and includes a default credential store ID for the supported credential store types, such as aws for AWS Secrets Manager.
To enable using credential stores, you uncomment this property and enter a comma-separated list of the credential store IDs to use.
You can specify multiple credential stores, such as two AWS credential stores. You simply specify a unique ID for each credential store.
Sets of related properties
Each supported credential store type has a set of related properties. The property names include the default credential store IDs originally specified in the credentialStores property.
For example, the AWS Secrets Manager properties include the default Secrets Manager ID, aws, in each Secrets Manager property name, such as credentialStore.aws.config.region and credentialStore.aws.config.access.key.
When you use a custom credential store ID, you must update all related property names to match the custom ID. For example, to use awsUS as a custom ID, replace aws with awsUS in every default Secrets Manager property name, so that credentialStore.aws.config.region becomes credentialStore.awsUS.config.region.
Note: To use multiple credential stores of the same type, you must have a set of related store properties that are renamed and defined appropriately for each credential store.

For example, say you want to use two AWS credential stores, awsDev for development and awsProd for production. To do this, you specify the credential store IDs in the credentialStores property and make a copy of the related AWS credential store properties, so you have one set for each credential store.

Then, you rename and configure the properties for awsDev, and you do the same for awsProd. The resulting properties might look as follows. Note the credentialStores value and the credential store ID in each property name:
##############################################################
#        Transformer for Snowflake Credential Stores         #
##############################################################

credentialStores=awsDev,awsProd

#credentialStores.usePortableGroups=false

################################################################
# awsDev: AWS Secrets Manager Credential Store Configuration #
################################################################
# The following properties are for an AWS Secrets Manager credential store that uses the 'aws'
# default credential store ID. If you specified a custom ID in the credentialStores property,
# replace 'aws' in the property names with the custom ID.
# Defines the implementation of the 'aws' credential store
# Update 'aws' in the property name as needed, but do not change the definition of this property.
credentialStore.awsDev.def=streamsets-transformer-aws-secrets-manager-credentialstore-lib::com_streamsets_datacollector_credential_aws_secrets_manager_AWSSecretsManagerCredentialStore
# Default name-key separator for the name parameter in credential functions
credentialStore.awsDev.config.nameKey.separator=&
# AWS region
credentialStore.awsDev.config.region=us-west-1
# AWS access key
credentialStore.awsDev.config.access.key=AWSACCESSKEYDEV
# AWS secret key
credentialStore.awsDev.config.secret.key=AWS/CONFIG/SECRETKEYDEV
# Secrets max cache size
# Maximum number of secrets to cache locally
credentialStore.awsDev.config.cache.max.size=1024
# Secrets cache TTL
# The number of milliseconds that a cached secret is considered valid before requiring a refresh
# The default is equivalent to 1 hour
credentialStore.awsDev.config.cache.ttl.millis=3600000
# Requires a group secret for each secret
credentialStore.awsDev.config.enforceEntryGroup=false
################################################################
# awsProd: AWS Secrets Manager Credential Store Configuration #
################################################################
# The following properties are for an AWS Secrets Manager credential store that uses the 'aws'
# default credential store ID. If you specified a custom ID in the credentialStores property,
# replace 'aws' in the property names with the custom ID.
# Defines the implementation of the 'aws' credential store
# Update 'aws' in the property name as needed, but do not change the definition of this property.
credentialStore.awsProd.def=streamsets-transformer-aws-secrets-manager-credentialstore-lib::com_streamsets_datacollector_credential_aws_secrets_manager_AWSSecretsManagerCredentialStore
# Default name-key separator for the name parameter in credential functions
credentialStore.awsProd.config.nameKey.separator=&
# AWS region
credentialStore.awsProd.config.region=us-west-1
# AWS access key
credentialStore.awsProd.config.access.key=AWSACCESSKEY
# AWS secret key
credentialStore.awsProd.config.secret.key=AWS/CONFIG/SECRETKEY
# Secrets max cache size
# Maximum number of secrets to cache locally
credentialStore.awsProd.config.cache.max.size=1024
# Secrets cache TTL
# The number of milliseconds that a cached secret is considered valid before requiring a refresh
# The default is equivalent to 1 hour
credentialStore.awsProd.config.cache.ttl.millis=3600000
# Requires a group secret for each secret
credentialStore.awsProd.config.enforceEntryGroup=false
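
The copy-and-rename step can also be scripted. The following Python sketch shows one possible approach, assuming that the default AWS property block is available as text. The helper name rename_store_properties and the shortened DEFAULT_BLOCK template are illustrative only and are not part of Transformer for Snowflake.

```python
import re


def rename_store_properties(template: str, default_id: str, custom_id: str) -> str:
    """Return a copy of a property block with default_id replaced by custom_id.

    Only property names are changed (including commented-out ones such as
    '#credentialStore.aws...'); comment text and property values are left as-is.
    """
    pattern = re.compile(
        r"^(#?[ \t]*credentialStore\.)" + re.escape(default_id) + r"(\.)",
        re.MULTILINE,
    )
    return pattern.sub(lambda m: m.group(1) + custom_id + m.group(2), template)


# Shortened, illustrative stand-in for the default AWS property block.
DEFAULT_BLOCK = """\
# AWS region
credentialStore.aws.config.region=
# AWS access key
credentialStore.aws.config.access.key=
"""

store_ids = ["awsDev", "awsProd"]
print("credentialStores=" + ",".join(store_ids))
for store_id in store_ids:
    # One renamed copy of the block per credential store ID.
    print(rename_store_properties(DEFAULT_BLOCK, "aws", store_id))
```

The sketch only renames properties. You still define the region, keys, and other values for each credential store afterward.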

Group Access to Secrets

As an additional layer of security, you can employ user groups to further limit access to the secrets defined in credential stores.

Transformer for Snowflake provides two methods to limit access with user groups:
Required group argument in credential functions
Credential functions include a group argument that defines the user group that can access the secret. The group argument ensures that the user who attempts to preview, validate, or start a pipeline that includes a credential function belongs to the group specified in the function. The user must also have execute permission on the pipeline.
Specify the group argument using the following naming convention: <group ID>@<organization ID>. For example, devops@9a213-b18-1eb-b9c-15ad68.
Note: The organization ID differs from the organization name. It is an alphanumeric string that you can find by going to Manage > My Organization.
If you do not want to restrict access to a secret, specify the default group using all or all@<organization ID>.

If Transformer for Snowflake shuts down while running a pipeline that uses a credential function, Transformer for Snowflake restarts the pipeline without checking the group access.

Optional group secrets in the credential store

In addition to using the group argument in credential functions, you can configure Transformer for Snowflake to require group secrets for a credential store.

To require the use of group secrets, in the Transformer for Snowflake credential store configuration properties, set the credentialStore.<cstore ID>.config.enforceEntryGroup property to true.

A group secret is a secret defined in the credential store that contains a comma-delimited list of Transformer for Snowflake user groups permitted to access the associated secret.

When the credential store ID requires group secrets, you must define a group secret for every secret that Transformer for Snowflake accesses in that credential store. The name of the group secret is based on the secret name, as follows:
<secret name>-groups
When you configure a credential function to call a secret, the user group specified in the credential function must be listed in the associated group secret that is defined in the credential store.
For example, say you enable Transformer for Snowflake to require group secrets for AWS Secrets Manager. Then, when specifying the access key for the Snowflake connection, you use the following expression:
${credential:get("aws", "awsprod@9a213-b18-1eb-b9c-15ad68", "accesskeyprod")}
When you run the pipeline, Transformer for Snowflake validates all of the following:
  • The user who starts the pipeline is in the awsprod user group.
  • The accesskeyprod secret has an associated accesskeyprod-groups group secret defined in the credential store.
  • The accesskeyprod-groups group secret includes the awsprod user group.

When Transformer for Snowflake is not configured to require group secrets, Transformer for Snowflake validates only the first point, verifying that the user belongs to the specified group.
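
The validation order described above can be summarized in a short Python sketch. This is a simplified model for illustration, not the product's implementation. The function name, the dictionary that stands in for the credential store, and the exact string comparison of group names are all assumptions, and the sketch does not model the default all group.

```python
def can_access_secret(user_groups, function_group, secret_name,
                      store_secrets, enforce_entry_group):
    """Model the group checks for one credential function call.

    user_groups         -- set of groups that the user belongs to
    function_group      -- group argument used in the credential function
    secret_name         -- name of the secret that the function retrieves
    store_secrets       -- dict of secret name -> value (stand-in for the store)
    enforce_entry_group -- value of the enforceEntryGroup property
    """
    # Check 1: the user belongs to the group named in the credential function.
    if function_group not in user_groups:
        return False
    if not enforce_entry_group:
        return True
    # Check 2: the secret has an associated <secret name>-groups group secret.
    group_secret = store_secrets.get(secret_name + "-groups")
    if group_secret is None:
        return False
    # Check 3: the group secret lists the group used in the function.
    allowed = {group.strip() for group in group_secret.split(",")}
    return function_group in allowed


group = "awsprod@9a213-b18-1eb-b9c-15ad68"
store = {"accesskeyprod": "placeholder", "accesskeyprod-groups": group}
print(can_access_secret({group}, group, "accesskeyprod", store, True))
```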

Sensitive Data in Credential Store Properties

When you define credential store properties, you must enter some sensitive data such as passwords to authenticate with the credential store system. For example, to use the AWS Secrets Manager credential store system, you enter the AWS access key ID and secret access key.

To prevent exposing the sensitive data in the Control Hub deployment details, Control Hub displays the sensitive values as REDACTED after you save the deployment.

Alternatively, instead of entering sensitive data in the configuration properties, you can protect the sensitive data by storing the data in an external location and then using functions to retrieve the data. When you use functions in credential store properties, Control Hub does not redact the credential store property values. Control Hub displays the defined functions after you save the deployment.

AWS Secrets Manager

To use the AWS Secrets Manager credential store system, define the configuration properties used to connect to Secrets Manager. Then, use credential functions in pipeline or stage properties to retrieve stored values.

In Secrets Manager, you must configure an access key and secret key pair that has permission to read the secrets. To follow best practices, make secrets read-only and limit access. See the Secrets Manager documentation on identity and access management (IAM) policies.

Note: This documentation includes Secrets Manager information needed for the configuration process. For more information, see the AWS Secrets Manager documentation.
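
Before you configure the engine, it can help to confirm that the key pair can read a secret. The following Python sketch shows one way to do that with the boto3 library, which is separate from Transformer for Snowflake. It assumes that the secret is stored as a JSON key/value document. The function names are illustrative.

```python
import json


def read_secret_key(client, secret_name, key):
    """Return the value stored under key in the secret secret_name.

    client is any object that provides the boto3 Secrets Manager
    get_secret_value method. Assumes the secret is a JSON key/value document.
    """
    response = client.get_secret_value(SecretId=secret_name)
    return json.loads(response["SecretString"])[key]


def make_client(region, access_key, secret_key):
    """Build a Secrets Manager client from the same three values that the
    region, access.key, and secret.key properties hold."""
    import boto3  # requires the boto3 package; imported only when needed

    return boto3.client(
        "secretsmanager",
        region_name=region,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )


# Example usage, with placeholder values:
#   client = make_client("us-west-1", "<access key>", "<secret key>")
#   read_secret_key(client, "SQLpassword", "SQLk1")
```

If the key pair lacks read permission, the get_secret_value call raises an error instead of returning a value.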

Step 1. Configure the Credential Store Properties

To enable Transformer for Snowflake to connect to the AWS Secrets Manager credential store, configure the Secrets Manager properties in the Transformer for Snowflake credential store configuration properties.

  1. In Control Hub, edit the deployment, and in the Configure Engine section, click Advanced Configuration. Then, click Credential Stores.
  2. Uncomment the credentialStores property in the file and specify the credential store ID to use. Use only alphabetic characters for the credential store ID.

    By default, the property lists a default credential store ID: aws for AWS Secrets Manager. You can use the default when using just a single Secrets Manager.

    To enable multiple credential stores, specify a comma-separated list of credential store IDs. For example, to use multiple Secrets Manager credential stores, specify a separate ID for each, such as awsDev,awsProd.

  3. Uncomment and configure the following properties as needed.

    If you specified a custom credential store ID, update the names of the following properties, replacing aws with the custom ID. When using the default credential store ID, leave the property names as they are.

    To use multiple AWS Secrets Manager credential stores, make a copy of the properties for each credential store. Then, update the credential store ID in each set of property names before defining the properties. For an example, see Enabling Credential Stores.
    Note: Control Hub displays sensitive data such as passwords as REDACTED after you save the deployment.

    These properties are grouped in the AWS Secrets Manager section of the file:

    Secrets Manager Property
        Description

    credentialStore.<cstore ID>.def
        Required. Defines the implementation of the AWS Secrets Manager credential store.
        Do not change the default value.

    credentialStore.<cstore ID>.config.nameKey.separator
        Optional. Separator to use in the name argument for credential functions.
        Note: In Secrets Manager, names can contain alphanumeric characters and the following special characters: / _ + = . @ - Therefore, avoid using those characters as separators.

    credentialStore.<cstore ID>.config.region
        Required. AWS region that hosts Secrets Manager. For a list of available regions, see the AWS Region Table.

    credentialStore.<cstore ID>.config.access.key
        Required when using access keys to authenticate with AWS. AWS access key.

    credentialStore.<cstore ID>.config.secret.key
        Required when using access keys to authenticate with AWS. AWS secret key.

    credentialStore.<cstore ID>.config.cache.max.size
        Optional. Maximum number of secrets that Transformer for Snowflake can cache locally. Default is 1024.

    credentialStore.<cstore ID>.config.cache.ttl.millis
        Optional. Number of milliseconds that Transformer for Snowflake considers a cached secret valid before requiring a refresh. Default is 3600000, which is 1 hour.

    credentialStore.<cstore ID>.config.enforceEntryGroup
        Optional. Requires Transformer for Snowflake to verify that a user who previews, validates, or starts the pipeline belongs to a group that is permitted to access the secret.
        When set to true, each secret must have a corresponding <secret name>-groups secret that contains a comma-separated list of the groups that are permitted to access the secret.
        For more information, see Group Access to Secrets.
        Default is false.

  4. Save the changes to the deployment and restart all engine instances.
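
Before you save the deployment, you can sanity-check the edited properties. The following Python sketch is one possible check, not a product feature. It applies two rules from the steps above: credential store IDs use only alphabetic characters, and each listed ID has its own renamed def and region properties. The function name and the choice of which properties to check are assumptions.

```python
import re


def check_credential_store_config(properties_text):
    """Return a list of problems found in credential store properties text."""
    props = {}
    for line in properties_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()

    problems = []
    raw_ids = props.get("credentialStores", "")
    store_ids = [item.strip() for item in raw_ids.split(",") if item.strip()]
    if not store_ids:
        problems.append("credentialStores is not set (is it still commented out?)")
    for store_id in store_ids:
        if not re.fullmatch(r"[A-Za-z]+", store_id):
            problems.append(f"{store_id}: ID must use only alphabetic characters")
        # The def and region properties are marked Required for Secrets Manager.
        for suffix in ("def", "config.region"):
            name = f"credentialStore.{store_id}.{suffix}"
            if name not in props:
                problems.append(f"{store_id}: missing property {name}")
    return problems


sample = """\
credentialStores=awsDev,awsProd
credentialStore.awsDev.def=placeholder
credentialStore.awsDev.config.region=us-west-1
"""
for problem in check_credential_store_config(sample):
    print(problem)
```

In this sample, the check reports the two missing awsProd properties, which is the typical result of listing a second ID without copying and renaming its property set.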

Step 2. Call Secrets from the Pipeline

Specify credential functions in connection properties to retrieve secrets stored in AWS Secrets Manager.

Use the credential functions in any property that displays the key icon next to it. For example, you can use credential functions in the User, Password, and Role properties of a connection.

When you use a credential function in a connection property, the function must be the only value defined in the property. For details about credential functions, see Credential Functions.

Credential Functions

Credential functions provide access to sensitive information, such as user names and passwords, that is secured in a credential store. Use credential functions in connection properties to enable Transformer for Snowflake to access external systems without exposing those values.

Before you use a credential function, you must configure Transformer for Snowflake to use one of the supported credential stores.

You can use credential functions in any property that displays a key icon next to the property name.

Important: When you use a credential function in a property, the function must be the only value defined in the property.

You can replace any argument with a literal or an expression that evaluates to the argument. String literals must be enclosed in single or double quotation marks.

The StreamSets expression language provides the following credential functions:
credential:get(<cstoreId>, <userGroup>, <name>)
Returns the secret from the credential store. Uses the following arguments:
  • cstoreId - Unique ID of the credential store to use. Use the ID specified in the Transformer for Snowflake credential store configuration properties. For more information, see Enabling Credential Stores.
  • userGroup - Group that a user must belong to in order to access the secret. Only users that have execute permission on the pipeline and that belong to this group can validate, preview, or run the pipeline that retrieves the secret.
    Specify the group using the required naming convention: <group ID>@<organization ID>.
    Note: The organization ID differs from the organization name. It is an alphanumeric string that you can find by going to Manage > My Organization.

    To grant access to all users, specify the default group using all or all@<organization ID>.

  • name - Name of the secret to retrieve from the credential store. Use the required format for the credential store:
    • AWS Secrets Manager - Enter the name of the secret to retrieve from Secrets Manager. Use the following format: "<name><separator><key>", where:
      • <name> is the name of the secret in Secrets Manager to read.
      • <separator> is the separator defined in the Transformer for Snowflake credential store configuration properties.
      • <key> is the key for the value that you want returned.
Return type: String.
AWS Secrets Manager example: The following expression returns the value from the key SQLk1 of the secret SQLpassword from the awsdev credential store. The expression uses an ampersand (&) as the separator in the name argument because that is the separator defined in the Transformer for Snowflake credential store configuration properties. The expression allows any user in the devops group to access the key when validating, previewing, or running the pipeline:
${credential:get("awsdev", "devops@9a213-b18-1eb-b9c-15ad68", "SQLpassword&SQLk1")}
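
If you generate connection settings with a script, you can assemble the expression from its parts. The following Python sketch builds a credential:get expression from the arguments described above. The function name and parameter names are illustrative, and the sketch assumes the default ampersand separator unless you pass another one.

```python
def credential_get_expression(cstore_id, group_id, org_id,
                              secret_name, key, separator="&"):
    """Build a ${credential:get(...)} expression as a string."""
    if separator in secret_name or separator in key:
        # The name argument would be ambiguous if a part contained the separator.
        raise ValueError("separator must not appear in the secret name or key")
    user_group = f"{group_id}@{org_id}"          # <group ID>@<organization ID>
    name = f"{secret_name}{separator}{key}"      # <name><separator><key>
    return f'${{credential:get("{cstore_id}", "{user_group}", "{name}")}}'


print(credential_get_expression(
    "awsdev", "devops", "9a213-b18-1eb-b9c-15ad68", "SQLpassword", "SQLk1"))
```

With these arguments, the sketch produces the same expression as the example above.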
credential:getWithOptions(<cstoreId>, <userGroup>, <name>, <storeOptions>)
Returns the secret from the credential store using additional options to communicate with the credential store.
For example, you might use this function with AWS Secrets Manager to specify a different separator character to use.
Uses the following arguments:
  • cstoreId - Unique ID of the credential store to use. Use the ID specified in the Transformer for Snowflake credential store configuration properties. For more information, see Enabling Credential Stores.
  • userGroup - Group that a user must belong to in order to access the secret. Only users that have execute permission on the pipeline and that belong to this group can validate, preview, or run the pipeline that retrieves the secret.
    Specify the group using the required naming convention: <group ID>@<organization ID>.
    Note: The organization ID differs from the organization name. It is an alphanumeric string that you can find by going to Manage > My Organization.

    To grant access to all users, specify the default group using all or all@<organization ID>.

  • name - Name of the secret to retrieve from the credential store. Use the required format for the credential store:
    • AWS Secrets Manager - Enter the name of the secret to retrieve from Secrets Manager. Use the following format: "<name><separator><key>", where:
      • <name> is the name of the secret in Secrets Manager to read.
      • <separator> is the separator defined either in the Transformer for Snowflake credential store configuration properties or with the separator option, described below.
      • <key> is the key for the value that you want returned.
  • storeOptions - Additional options to communicate with the credential store.
    For AWS Secrets Manager, you can use the following options to override several properties in the Transformer for Snowflake credential store configuration properties:
    • separator - Specifies the separator for name and key values in the credential functions, overriding the credentialStore.<cstore ID>.config.nameKey.separator property.
    • alwaysRefresh - When set to true, forces the key to refresh its cached value before Transformer for Snowflake retrieves the value, overriding the credentialStore.<cstore ID>.config.cache.ttl.millis property. Be aware that always refreshing the cached value significantly increases the pipeline run time.
Return type: String.
AWS Secrets Manager example: The following expression returns the value from the key SQLk1 of the secret SQLpassword from the awsdev credential store, overriding the separator defined in the Transformer for Snowflake credential store configuration properties with a pipe ( | ). The expression allows any user in the devops group to access the key when validating, previewing, or running the pipeline:
${credential:getWithOptions("awsdev", "devops@9a213-b18-1eb-b9c-15ad68", "SQLpassword|SQLk1", "separator=|")}
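
To illustrate how the separator option changes the way the name argument is read, the following Python sketch models the split into secret name and key. It is a simplified illustration, not the engine's parsing logic. It handles only a single separator=<value> option and assumes that the name is split at the first occurrence of the separator.

```python
def effective_separator(configured_separator, store_options=""):
    """Return the separator option if present, else the configured separator."""
    prefix = "separator="
    if store_options.startswith(prefix) and len(store_options) > len(prefix):
        return store_options[len(prefix):]
    return configured_separator


def split_secret_name(name, configured_separator="&", store_options=""):
    """Split a name argument into (secret name, key)."""
    separator = effective_separator(configured_separator, store_options)
    secret_name, found, key = name.partition(separator)
    if not found or not secret_name or not key:
        raise ValueError(f"expected <name>{separator}<key>, got {name!r}")
    return secret_name, key


# Configured separator (&), as in the credential:get example.
print(split_secret_name("SQLpassword&SQLk1"))
# Separator overridden with a pipe, as in the credential:getWithOptions example.
print(split_secret_name("SQLpassword|SQLk1", "&", "separator=|"))
```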