SAML Authentication Overview

The StreamSets platform supports SAML 2.0 single sign-on (SSO) authentication with selected identity providers (IdPs).

Enabling SAML authentication involves registering StreamSets as a service provider in one of the supported IdPs and configuring SAML authentication for your Control Hub organization.

To configure a supported IdP, you upload the StreamSets SAML metadata file or manually configure the required StreamSets properties in the IdP. Depending on whether you want StreamSets to sign SAML requests or the IdP to encrypt SAML response assertions, you can also upload to the IdP the service provider certificate that StreamSets generates for your organization. You also configure the SAML attributes that the IdP passes to StreamSets.

To configure the Control Hub organization, a user with the Organization Administrator role logs in to StreamSets using local or public identity provider authentication. The organization administrator sets up a draft SAML configuration for the organization. After testing the draft SAML configuration, the organization administrator publishes the configuration to production and then enables the configuration to activate it.

Once SAML authentication is enabled, all organization users must log in using SAML authentication. StreamSets supports both identity provider initiated (IdP-initiated) logins and service provider initiated (SP-initiated) logins.

After enabling SAML authentication, you can optionally configure the provisioning of user accounts from the IdP to StreamSets.

Supported Identity Providers

StreamSets supports the following SAML identity providers:
  • Microsoft Active Directory Federation Services (AD FS) - tested on Windows Server 2019
  • Microsoft Entra ID (previously known as Azure AD)
  • Okta
  • PingFederate

IdP and SP-initiated Logins

StreamSets supports both identity provider initiated (IdP-initiated) logins and service provider initiated (SP-initiated) logins. A user with the Organization Administrator role can optionally disable SP-initiated logins.

IdP-initiated Logins

For IdP-initiated logins, users log in to the IdP login page or dashboard, and then click an icon to log in to StreamSets. The IdP authenticates the user, and then sends a SAML Response message to the StreamSets Assertion Consumer Service (ACS) endpoint.

For example, if you use Okta as the IdP, you log in to Okta and then click the StreamSets app integration icon on your Okta dashboard.

SP-initiated Logins

For SP-initiated logins, users log in directly to StreamSets. StreamSets initiates authentication requests by sending SAML AuthnRequest messages to the IdP SSO endpoint. After the IdP authenticates the user's identity, the user is logged in to StreamSets.
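As an illustration of the SP-initiated flow, the following sketch builds a minimal, unsigned SAML 2.0 AuthnRequest and encodes it for the HTTP-Redirect binding (raw DEFLATE, then Base64, then URL encoding). It is not StreamSets code: the binding that StreamSets actually uses is not stated here, and the entity ID and URLs are hypothetical placeholders.

```python
import base64
import uuid
import zlib
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlencode, urlparse
from xml.etree import ElementTree as ET

SAMLP = "urn:oasis:names:tc:SAML:2.0:protocol"
SAML = "urn:oasis:names:tc:SAML:2.0:assertion"
ET.register_namespace("samlp", SAMLP)
ET.register_namespace("saml", SAML)


def build_authn_request(sp_entity_id: str, acs_url: str, idp_sso_url: str) -> bytes:
    """Build a minimal, unsigned AuthnRequest document."""
    root = ET.Element(
        f"{{{SAMLP}}}AuthnRequest",
        {
            "ID": "_" + uuid.uuid4().hex,  # XML IDs must not start with a digit
            "Version": "2.0",
            "IssueInstant": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
            "Destination": idp_sso_url,
            "AssertionConsumerServiceURL": acs_url,
        },
    )
    issuer = ET.SubElement(root, f"{{{SAML}}}Issuer")
    issuer.text = sp_entity_id
    return ET.tostring(root)


def redirect_url(idp_sso_url: str, request_xml: bytes) -> str:
    """Encode the request as the SAMLRequest query parameter."""
    compressor = zlib.compressobj(wbits=-15)  # negative wbits: raw DEFLATE, no zlib header
    deflated = compressor.compress(request_xml) + compressor.flush()
    encoded = base64.b64encode(deflated).decode("ascii")
    return idp_sso_url + "?" + urlencode({"SAMLRequest": encoded})


def decode_redirect_url(url: str) -> bytes:
    """Reverse the encoding, as the IdP would when it receives the request."""
    encoded = parse_qs(urlparse(url).query)["SAMLRequest"][0]
    return zlib.decompress(base64.b64decode(encoded), -15)
```

A signed request sent over this binding would carry additional SigAlg and Signature query parameters, which the sketch omits.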

By default, SP-initiated logins are enabled for each Control Hub organization. The organization administrator can disable SP-initiated logins, requiring IdP-initiated logins.

For example, when SP-initiated logins are enabled for your organization, you can log in using the Sign in with SSO SAML button on the StreamSets platform login page.

Service Provider Certificates

StreamSets generates a unique service provider certificate for your organization.

Upload the certificate to your IdP for the following reasons:
  • When you want StreamSets to sign SAML requests for SP-initiated logins.

    Requires that SP-initiated logins are enabled for your Control Hub organization.

  • When you want the IdP to encrypt SAML response assertions.

    Requires that the Require Encryption on Assertion property is enabled for your Control Hub organization and that the IdP is configured to encrypt the SAML assertion.

As a best practice, StreamSets recommends uploading the StreamSets certificate to your IdP. However, if you do not want to use either functionality, you can skip uploading the certificate.

Note: If you download the StreamSets SAML metadata file from Control Hub when the Require Encryption on Assertion property is enabled, then the metadata file automatically includes the StreamSets certificate.

Each service provider certificate has an expiration date. You can create a new certificate and then rotate the certificate in your IdP when the expiration date approaches.

Request and Response Validation

StreamSets validates the SAML requests and responses based on the following configurations:

StreamSets certificate is uploaded to IdP, SP-initiated login is enabled
  • Encryption on assertion must be enabled for the Control Hub organization.
  • SAML requests sent by StreamSets for SP-initiated logins are signed.
  • SAML responses sent by the IdP must be signed at the response level or both the response and assertion levels. The assertion in the response can be encrypted, based on additional configurations in the IdP.

StreamSets certificate is uploaded to IdP, SP-initiated login is disabled
  • Encryption on assertion must be enabled for the Control Hub organization.
  • StreamSets does not send SAML requests to the IdP.
  • SAML responses sent by the IdP must be signed at the response level or both the response and assertion levels. The assertion in the response must be encrypted.

No StreamSets certificate in the IdP, SP-initiated login is enabled
  • Encryption on assertion can be enabled or disabled for the Control Hub organization.
  • SAML requests sent by StreamSets for SP-initiated logins are not signed.
  • SAML responses sent by the IdP must be signed at the response level, assertion level, or both. The assertion in the response should not be encrypted.

No StreamSets certificate in the IdP, SP-initiated login is disabled
  • Encryption on assertion must be disabled for the Control Hub organization.
  • StreamSets does not send SAML requests to the IdP.
  • SAML responses sent by the IdP must be signed at the response level, assertion level, or both. The assertion in the response cannot be encrypted.
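The four cases above can be read as a lookup table keyed on two settings. The following sketch transcribes them into Python for illustration. The class, field, and function names are hypothetical and are not part of any StreamSets API.

```python
from typing import NamedTuple, Optional


class Rules(NamedTuple):
    encryption_setting: str            # required state of encryption on assertion
    requests_signed: Optional[bool]    # None: StreamSets sends no SAML requests
    assertion_only_signature_ok: bool  # True if signing only the assertion is enough
    assertion_encryption: str          # what applies to the assertion in the response


# Keys are (certificate_uploaded_to_idp, sp_initiated_enabled).
RULES = {
    (True, True): Rules("must be enabled", True, False, "can be encrypted"),
    (True, False): Rules("must be enabled", None, False, "must be encrypted"),
    (False, True): Rules("can be enabled or disabled", False, True, "should not be encrypted"),
    (False, False): Rules("must be disabled", None, True, "cannot be encrypted"),
}


def signature_accepted(certificate_uploaded: bool, sp_initiated: bool, signed_levels: set) -> bool:
    """Check a response's signature placement.

    signed_levels is a subset of {"response", "assertion"}.
    """
    rules = RULES[(certificate_uploaded, sp_initiated)]
    if "response" in signed_levels:
        return True  # response-level signing satisfies every case
    return rules.assertion_only_signature_ok and "assertion" in signed_levels
```

For example, `signature_accepted(True, True, {"assertion"})` returns `False`, because a response signed only at the assertion level is accepted only when no StreamSets certificate is in the IdP.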

SCIM Provisioning of User Accounts

After enabling SAML authentication, you can optionally configure the provisioning of user accounts from the identity provider (IdP) to StreamSets.

Note: At this time, you can configure SCIM provisioning when using Microsoft Entra ID (previously known as Azure AD) as your IdP. Provisioning support for additional IdPs will be provided in future releases.

StreamSets supports System for Cross-domain Identity Management (SCIM) 2.0. SCIM is a standard implemented within the supported IdPs that allows for the automatic provisioning of user accounts from the IdP to a service provider. When a user or group is created, updated, or deleted in the IdP, the same changes are automatically made in the service provider, StreamSets in this case.

SCIM provisioning simplifies user and group management. Instead of manually creating and updating IdP users and groups in StreamSets, you can configure SCIM provisioning once. The IdP provisioning process automatically performs an initial synchronization of users and groups to StreamSets, and then performs incremental synchronizations at regular intervals so that updates made to users and groups in the IdP are also made to StreamSets.

Important: You cannot synchronize an IdP group named all to StreamSets. Each StreamSets organization provides a default group named all that includes every user in the organization.
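Conceptually, each synchronization compares the IdP records with the copy held by the service provider and works out what to create, update, or delete. The following sketch illustrates that comparison only. It is not the SCIM protocol or the IdP's actual algorithm, and the function name, record layout, and exact-match handling of the reserved all group are assumptions.

```python
RESERVED_GROUPS = frozenset({"all"})  # the default group cannot be synchronized


def plan_sync(idp: dict, sp: dict, reserved: frozenset = frozenset()) -> dict:
    """Compare IdP records (the source of truth) with the service provider's copy.

    Both arguments map a record key (for example, an email address or a
    group name) to a dict of attributes. Reserved keys are ignored on both sides.
    """
    idp = {key: value for key, value in idp.items() if key not in reserved}
    sp = {key: value for key, value in sp.items() if key not in reserved}
    return {
        "create": sorted(key for key in idp if key not in sp),
        "update": sorted(key for key in idp if key in sp and idp[key] != sp[key]),
        "delete": sorted(key for key in sp if key not in idp),
    }
```

Calling `plan_sync(idp_groups, sp_groups, reserved=RESERVED_GROUPS)` leaves the default all group untouched even if the IdP also defines a group with that name.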

After you configure SCIM provisioning for your organization, you cannot use Control Hub to invite users, create groups, or update user and group details. Instead, you create and update users and groups within the IdP.

You must use Control Hub to assign roles and permissions to the provisioned users and groups.

Default Roles for Provisioned Users

You can define a set of default roles that are automatically assigned to each newly provisioned user. For example, you might want to define default role assignments that permit general tasks completed by all users. You can then change the role assignments as needed to secure the integrity of your organization and data.

To define default roles for newly provisioned users, assign a set of roles to the Control Hub all group. The all group includes every user in the organization, so all users inherit roles from this group.

Note: You cannot define a set of default roles that are automatically assigned to each newly provisioned group. After the initial provisioning of groups, you must assign roles to the groups.
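The default-role behavior can be pictured as a union of the roles assigned directly to a user and the roles assigned to the all group. The following sketch is only an illustration of that idea. The function name and the role names are hypothetical placeholders, and inheritance from groups other than all is deliberately left out.

```python
DEFAULT_GROUP = "all"


def effective_roles(direct_roles: set, group_roles: dict) -> set:
    """Roles a user holds: direct assignments plus roles inherited from the all group."""
    return set(direct_roles) | set(group_roles.get(DEFAULT_GROUP, ()))


# Hypothetical role names; see Roles for the actual list.
group_roles = {"all": {"role-a", "role-b"}}

# A newly provisioned user has no direct roles yet and holds only the defaults.
new_user_roles = effective_roles(set(), group_roles)

# An administrator can later assign more roles directly to the user.
updated_roles = effective_roles({"role-c"}, group_roles)
```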

For a description of each role, see Roles.

IdP Attribute Mappings

The IdP uses attribute mappings to pass user attributes, such as user email address, first name, and last name, to StreamSets. You configure attribute mappings in the IdP when you enable SAML authentication and also when you enable SCIM provisioning.

Note: At this time, you can configure SCIM provisioning when using Microsoft Entra ID (previously known as Azure AD) as your IdP. Provisioning support for additional IdPs will be provided in future releases.

StreamSets uses the different types of attribute mappings as follows:

SAML attribute mappings

SAML attributes are used to authenticate the user. The IdP must pass the user email address to StreamSets. You can optionally configure attribute mappings for the user display name, first name, and last name.

When you configure the user name attribute mappings, StreamSets uses the mappings for new users invited to the organization after SAML authentication is enabled. StreamSets does not use the mappings to update user name values for existing users that joined the organization before SAML authentication is enabled. If the SAML attribute values change in the IdP, StreamSets does not update the values for existing users.

If you do not configure the user name attribute mappings, StreamSets uses the first portion of the email address as the user display name. Users can change their display name in their Control Hub account settings.

By default, StreamSets maps the incoming IdP attributes to the following StreamSets properties:

Incoming IdP Attribute Name     StreamSets Property
Attribute.displayName           IdP User Display Name
Subject.NameID                  IdP User Email
Attribute.firstName             IdP User First Name
Attribute.lastName              IdP User Last Name

When you configure SAML attribute mappings in the IdP, StreamSets recommends mapping the user values to these IdP attribute names. If you map the user values to different IdP attribute names, then you also must update the IdP attribute names when you set up a draft SAML configuration in StreamSets Control Hub.

For example, if you configure your IdP to pass the user first name to StreamSets in an attribute named first, then you also must modify the IdP User First Name property in Control Hub to Attribute.first.

SCIM attribute mappings

SCIM attributes are used to define user account details. The IdP must pass the user email address, display name, first name, and last name to StreamSets, in addition to other attributes as required by the IdP.

StreamSets uses the SCIM mappings for all provisioned user accounts. If the SCIM attribute values change in the IdP, the provisioning process automatically updates those values for existing users.

When you configure SCIM attribute mappings in the IdP, you must map the user values to the attribute names as described in Step 5. Configure SCIM Provisioning.

If you define the optional user name attribute mappings when you enable SAML authentication and you also enable SCIM provisioning, the SCIM attribute mappings take precedence.
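To make the mapping rules concrete, the following sketch resolves user details from incoming SAML attributes using the default mappings, falls back to the first portion of the email address for the display name, and lets SCIM values take precedence when both are present. The function names and the output keys are hypothetical. Treating the first portion of the email address as the text before the @ sign is also an assumption.

```python
# StreamSets property -> incoming IdP attribute name (the defaults listed above).
DEFAULT_MAPPING = {
    "IdP User Display Name": "Attribute.displayName",
    "IdP User Email": "Subject.NameID",
    "IdP User First Name": "Attribute.firstName",
    "IdP User Last Name": "Attribute.lastName",
}


def resolve_saml_user(incoming: dict, mapping: dict = DEFAULT_MAPPING) -> dict:
    """Pick user details out of the attributes passed by the IdP."""
    email = incoming.get(mapping["IdP User Email"])
    if not email:
        raise ValueError("the IdP must pass the user email address")
    display_name = incoming.get(mapping["IdP User Display Name"])
    if not display_name:
        display_name = email.split("@", 1)[0]  # assumed meaning of "first portion"
    return {
        "email": email,
        "display_name": display_name,
        "first_name": incoming.get(mapping["IdP User First Name"]),
        "last_name": incoming.get(mapping["IdP User Last Name"]),
    }


def apply_scim(saml_values: dict, scim_values: dict) -> dict:
    """SCIM attribute values take precedence over SAML attribute values."""
    return {**saml_values, **scim_values}


# The IdP passes the first name in an attribute named "first", so the
# IdP User First Name property is changed to Attribute.first.
custom_mapping = {**DEFAULT_MAPPING, "IdP User First Name": "Attribute.first"}
user = resolve_saml_user(
    {"Subject.NameID": "ana@example.com", "Attribute.first": "Ana"},
    custom_mapping,
)
```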

Draft and Production Configurations

You use the following types of configurations to enable SAML for your Control Hub organization:

Draft configuration

To update the SAML settings, modify the draft configuration. When SP-initiated logins are enabled, you can test that the draft SAML configuration is set up correctly with your identity provider.

After completing the draft configuration, you publish it to production.

For more information about testing, resetting, and publishing the draft configuration, see Managing SAML Authentication.

Production configuration

The production configuration is read-only. To edit the SAML settings, you must edit the draft configuration and then publish that draft.

After you publish a draft configuration to production, you enable the production configuration to activate it. When the SAML production configuration is enabled, all organization users must log in using SAML authentication.

You can disable the production configuration to disable SAML authentication for your organization.

For more information about enabling and disabling the production configuration, see Managing SAML Authentication.
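The relationship between the two configurations can be modeled as a small state machine: edits go to the draft, publishing copies the draft into a read-only production snapshot, and enabling activates that snapshot. The following sketch is a conceptual model only. The class, method, and setting names are hypothetical and do not correspond to a Control Hub API.

```python
from types import MappingProxyType


class SamlConfiguration:
    """Conceptual model of the draft and production SAML configurations."""

    def __init__(self):
        self.draft = {}          # editable settings
        self.production = None   # read-only snapshot, set by publish()
        self.enabled = False     # True while SAML authentication is active

    def update_draft(self, **settings):
        """All changes to the SAML settings go through the draft."""
        self.draft.update(settings)

    def publish(self):
        """Copy the draft into a read-only production snapshot."""
        self.production = MappingProxyType(dict(self.draft))

    def enable(self):
        """Activate the published configuration."""
        if self.production is None:
            raise RuntimeError("publish a draft configuration before enabling it")
        self.enabled = True

    def disable(self):
        """Turn off SAML authentication for the organization."""
        self.enabled = False
```

In this model, changing the draft after publishing does not affect production until the draft is published again, which mirrors the read-only behavior described above.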

For example, the following image shows that the SAML production configuration is read-only and is currently disabled: