Azure Data Lake Storage (Legacy) (deprecated)

Supported pipeline types:
  • Data Collector

The Azure Data Lake Storage (Legacy) destination writes data to Microsoft Azure Data Lake Storage Gen1.
Important: This stage is deprecated and may be removed in a future release.

You can use the Azure Data Lake Storage (Legacy) destination in standalone and cluster batch pipelines. The destination connects to Azure Data Lake Storage Gen1 with Azure Active Directory service principal authentication. To use Azure Active Directory refresh-token authentication, or to use the destination in a cluster streaming pipeline, use the Hadoop FS destination instead.

Before you use the destination, you must perform some prerequisite tasks.

When you configure the Azure Data Lake Storage (Legacy) destination, you specify connection information such as the Application ID and fully qualified domain name (FQDN) for the account.
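For example, a service principal connection might use values like the following. These are illustrative placeholders, not working credentials, and the exact property labels can vary by Data Collector version; the Application ID and key come from the Azure Active Directory application registered for the account:

  Application ID:      0a1b2c3d-1111-2222-3333-444455556666   (client ID of the Azure AD application)
  Auth Token Endpoint: https://login.microsoftonline.com/<tenant-id>/oauth2/token
  Account FQDN:        <account-name>.azuredatalakestore.net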

You can define a directory template and time basis to determine the output directories that the destination creates and the files where records are written. You can also define a file prefix and suffix, the data time zone, and properties that define when the destination closes a file.
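For example, a directory template like the following, where /ingest is a hypothetical base path, uses Data Collector time functions so that the destination creates a new output directory for each hour of data:

  /ingest/${YYYY()}-${MM()}-${DD()}-${hh()}

The time basis controls which clock those functions read: ${time:now()} resolves them against the time of processing, while an expression such as ${record:value('/timestamp')} resolves them against a datetime value in each record.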

Alternatively, you can use record header attributes to define the directory to write each record to, the Avro schema to use, and when to roll files. For more information, see Record Header Attributes for Record-Based Writes.
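As a sketch, an upstream stage such as an Expression Evaluator might set the header attributes that drive record-based writes. The directory path below is hypothetical; targetDirectory, avroSchema, and roll are the attribute names the destination reads:

  targetDirectory - the fully resolved directory to write the record to, for example /ingest/sales/west
  avroSchema      - the Avro schema to use when writing the record
  roll            - when present, triggers the destination to roll the current file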

The destination can also generate events for an event stream. For more information about the event framework, see Dataflow Triggers Overview.