Azure Blob Storage

Available when using an authoring Data Collector version 5.5.0 or later.

To create an Azure Blob Storage connection, the Azure stage library, streamsets-datacollector-azure-lib, must be installed on the selected authoring Data Collector.

For a description of the connection properties, see Azure Blob Storage Connection Properties.

After you create an Azure Blob Storage connection, you can use the connection in the following stages:
Engine: Data Collector 5.5.0 or later
Stages:
  • Azure Blob Storage origin

Prerequisites

Before configuring an Azure Blob Storage connection, ensure that the connection has read and write access to the needed objects.

The connection uses an authentication method to establish its identity with Azure, and Azure then authorizes that identity to read and write objects. Azure provides several authorization methods for specific types of objects, and the methods offer different levels of security. Choose authentication and authorization methods that are compatible with your objects. For information about the authorization methods, see the Microsoft Azure Blob Storage documentation.

Complete the steps that your chosen method requires, such as configuring an Azure Active Directory application, creating a shared key, or creating a shared access signature (SAS) token.

An Azure Blob Storage connection can use one of the following authentication methods:
OAuth with Service Principal
Connections made with OAuth with Service Principal authentication require the following information:
  • Application ID - Application ID for the Azure Active Directory Data Collector application. Also known as the client ID.

    For information on accessing the application ID from the Azure portal, see the Azure documentation.

  • Tenant ID - Tenant ID for the Azure Active Directory Data Collector application. Also known as the directory ID.

    For information on accessing the tenant ID from the Azure portal, see the Azure documentation.

  • Application Key - Authentication key for the Azure Active Directory application. Also known as the client secret.

    For information on accessing the application key from the Azure portal, see the Azure documentation.

Azure Managed Identity
Connections made with Azure Managed Identity authentication require the following information:
  • Application ID - Application ID for the Azure Active Directory Data Collector application. Also known as the client ID.

    For information on accessing the application ID from the Azure portal, see the Azure documentation.

Shared Key
Connections made with Shared Key authentication require the following information:
  • Account Shared Key - Shared access key that Azure generated for the storage account.

    For more information on accessing the shared access key from the Azure portal, see the Azure documentation.

SAS Token
Connections made with SAS Token authentication require the following information:
  • Azure SAS Token - Shared access signature (SAS) token that provides secure access to the needed resources in Azure Blob Storage.

    For more information on SAS tokens for storage containers, see the Azure documentation.
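
The requirements listed above can be checked before you create the connection. The following Python sketch is only an illustration and is not part of Data Collector: the `REQUIRED_CREDENTIALS` mapping and the `missing_credentials` function are hypothetical names, and the mapping simply restates the information listed for each authentication method.

```python
# Hypothetical helper, not a Data Collector API.
# The mapping restates the information that each authentication
# method requires, as listed in the prerequisites above.
REQUIRED_CREDENTIALS = {
    "OAuth with Service Principal": ("Application ID", "Tenant ID", "Application Key"),
    "Azure Managed Identity": ("Application ID",),
    "Shared Key": ("Account Shared Key",),
    "SAS Token": ("Azure SAS Token",),
}


def missing_credentials(method, supplied):
    """Return the required items that are absent or empty in `supplied`."""
    if method not in REQUIRED_CREDENTIALS:
        raise ValueError(f"Unknown authentication method: {method!r}")
    return [name for name in REQUIRED_CREDENTIALS[method] if not supplied.get(name)]


if __name__ == "__main__":
    # Placeholder values only. Do not hard-code real secrets.
    supplied = {"Application ID": "<application-id>", "Tenant ID": "<tenant-id>"}
    print(missing_credentials("OAuth with Service Principal", supplied))
    # prints ['Application Key']
```

An empty list means that every item required by the chosen method has been gathered.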

Azure Blob Storage Connection Properties

When creating an Azure Blob Storage connection, configure the following properties on the Azure tab:
  • Account FQDN - Host name of the Blob Storage account. For example:

    <storage account name>.blob.core.windows.net

  • Storage Container / File System - Name of the storage container or file system that contains the data to be read or written.

  • Authentication Method - Authentication method used to connect to Azure:
      • OAuth with Service Principal
      • Azure Managed Identity
      • Shared Key
      • SAS Token

  • Application ID - Application ID for the Azure Active Directory Data Collector application. Also known as the client ID.

    For information on accessing the application ID from the Azure portal, see the Azure documentation.

    Available when using the OAuth with Service Principal or the Azure Managed Identity authentication method.

  • Endpoint Type - Method used to provide endpoint details.

    Available when using the OAuth with Service Principal authentication method.

  • Tenant ID - Tenant ID for the Azure Active Directory Data Collector application. Also known as the directory ID.

    For information on accessing the tenant ID from the Azure portal, see the Azure documentation.

    Available when Endpoint Type is set to Tenant ID.

  • Endpoint URL - Endpoint URL for the Azure Active Directory Data Collector application.

    Default is https://login.microsoftonline.com/<tenant-id>/oauth2/token.

    In the URL, specify the tenant ID for the Azure Active Directory Data Collector application.

    For information on accessing the tenant ID from the Azure portal, see the Azure documentation.

    Available when Endpoint Type is set to Endpoint URL.

  • Application Key - Authentication key for the Azure Active Directory application. Also known as the client secret.

    For information on accessing the application key from the Azure portal, see the Azure documentation.

    Available when using the OAuth with Service Principal authentication method.

  • Account Shared Key - Shared access key that Azure generated for the storage account.

    For more information on accessing the shared access key from the Azure portal, see the Azure documentation.

    Available when using the Shared Key authentication method.

  • Azure SAS Token - Shared access signature (SAS) token that provides secure access to the needed resources in Azure Blob Storage.

    For more information on SAS tokens for storage containers, see the Azure documentation.

    Available when using the SAS Token authentication method.
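
As an illustration of how the properties above fit together, the following Python sketch builds the Account FQDN and the default Endpoint URL from their documented patterns, and lists the properties that apply to a given authentication method. It is a sketch under stated assumptions and not a Data Collector API: the function names and the `mystorageaccount` value are hypothetical, and nesting Tenant ID and Endpoint URL under the OAuth with Service Principal method is inferred from the availability notes.

```python
# Hypothetical helpers, not part of Data Collector.
BLOB_SUFFIX = "blob.core.windows.net"
DEFAULT_ENDPOINT_TEMPLATE = "https://login.microsoftonline.com/{tenant_id}/oauth2/token"


def account_fqdn(storage_account_name):
    """Build the Account FQDN: <storage account name>.blob.core.windows.net"""
    return f"{storage_account_name}.{BLOB_SUFFIX}"


def default_endpoint_url(tenant_id):
    """Fill the tenant ID into the documented default Endpoint URL."""
    return DEFAULT_ENDPOINT_TEMPLATE.format(tenant_id=tenant_id)


def applicable_properties(auth_method, endpoint_type=None):
    """List the properties that apply, following the availability notes."""
    props = ["Account FQDN", "Storage Container / File System", "Authentication Method"]
    if auth_method in ("OAuth with Service Principal", "Azure Managed Identity"):
        props.append("Application ID")
    if auth_method == "OAuth with Service Principal":
        props.append("Endpoint Type")
        # Assumption: Endpoint Type offers the two values named in the notes.
        if endpoint_type in ("Tenant ID", "Endpoint URL"):
            props.append(endpoint_type)
        props.append("Application Key")
    elif auth_method == "Shared Key":
        props.append("Account Shared Key")
    elif auth_method == "SAS Token":
        props.append("Azure SAS Token")
    return props


if __name__ == "__main__":
    print(account_fqdn("mystorageaccount"))  # hypothetical account name
    print(default_endpoint_url("<tenant-id>"))
    print(applicable_properties("OAuth with Service Principal", "Tenant ID"))
```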