Azure Data Lake Storage Gen1 (deprecated)

Supported pipeline types:
  • Data Collector

The Azure Data Lake Storage Gen1 destination writes data to Microsoft Azure Data Lake Storage Gen1. You can use the Azure Data Lake Storage Gen1 destination in standalone and cluster batch pipelines. For information about supported versions, see Supported Systems and Versions in the Data Collector documentation.
Important: This stage is deprecated and may be removed in a future release.
To write to Azure Data Lake Storage Gen2, use the Azure Data Lake Storage Gen2 destination.

Before you use the destination, you must perform some prerequisite tasks.

When you configure the Azure Data Lake Storage Gen1 destination, you specify information to connect to Azure. You define a directory template and time basis to determine the output directories that the destination creates and the files where records are written.
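As a rough illustration of how a directory template and time basis combine, the sketch below resolves datetime tokens in a template against a single point in time. The token names mirror the datetime functions used in directory templates (YYYY, MM, DD, hh); the resolver itself and the example template path are illustrative, not the destination's actual implementation.

```python
from datetime import datetime, timezone

def resolve_directory(template: str, basis: datetime) -> str:
    """Resolve datetime tokens in a directory template against a time basis.

    With a processing-time basis, 'basis' is the time of writing; with a
    record-time basis, it comes from a field in the record.
    """
    return (template
            .replace("${YYYY()}", f"{basis.year:04d}")
            .replace("${MM()}", f"{basis.month:02d}")
            .replace("${DD()}", f"{basis.day:02d}")
            .replace("${hh()}", f"{basis.hour:02d}"))

# Every record whose time basis falls in the same hour lands in the
# same output directory.
now = datetime(2024, 3, 9, 14, 5, tzinfo=timezone.utc)
print(resolve_directory("/sales/${YYYY()}-${MM()}-${DD()}/${hh()}", now))
# /sales/2024-03-09/14
```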

You can define a file prefix and suffix, the data time zone, and properties that determine when the destination closes a file. You can also specify how long a record can be written to its associated directory and what the destination does with late records.
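The late-record check can be pictured as a simple comparison between a record's time basis and the configured time limit. This is a minimal sketch of that timing decision only; what actually happens to a late record (for example, sending it to error handling or to a late records directory) is configured in the stage.

```python
from datetime import datetime, timedelta, timezone

def is_late(record_time: datetime, now: datetime,
            limit: timedelta = timedelta(hours=1)) -> bool:
    """A record is late when its time basis is older than the configured
    time limit relative to the current time (limit value is illustrative)."""
    return now - record_time > limit

now = datetime(2024, 3, 9, 14, 0, tzinfo=timezone.utc)
print(is_late(datetime(2024, 3, 9, 12, 30, tzinfo=timezone.utc), now))  # True
print(is_late(datetime(2024, 3, 9, 13, 45, tzinfo=timezone.utc), now))  # False
```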

When desired, you can use record header attributes to write records to specific directories and files, to apply a defined Avro schema, and to roll files. For more information, see Record Header Attributes for Record-Based Writes.
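Record-based writes can be sketched as a per-record lookup: a header attribute, when present, overrides the directory that the template would otherwise produce. The attribute name below follows the record-based-writes convention described in the documentation referenced above; the helper function itself is hypothetical.

```python
def target_for(record_header: dict, default_dir: str) -> str:
    """Pick the output directory for one record.

    When record-based writes are enabled, a 'targetDirectory' header
    attribute overrides the resolved directory template; otherwise the
    template directory is used.
    """
    return record_header.get("targetDirectory", default_dir)

# A record carrying the attribute goes to its own directory ...
print(target_for({"targetDirectory": "/weblogs/2024-03-09"}, "/sales/2024-03-09"))
# /weblogs/2024-03-09

# ... while a record without it falls back to the template result.
print(target_for({}, "/sales/2024-03-09"))
# /sales/2024-03-09
```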

You can use Gzip, Bzip2, Snappy, LZ4, and other compression formats to write output files.
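To make the compression option concrete, the sketch below writes newline-delimited JSON records through Gzip using Python's standard library, analogous to the destination compressing its output files. The file name and record contents are made up for the example.

```python
import gzip
import json
import os
import tempfile

records = [{"id": 1}, {"id": 2}]
path = os.path.join(tempfile.gettempdir(), "sdc-output.json.gz")

# Write one JSON record per line into a Gzip-compressed file.
with gzip.open(path, "wt", encoding="utf-8") as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")

# Reading it back yields the original records, one per line.
with gzip.open(path, "rt", encoding="utf-8") as f:
    lines = f.read().splitlines()
print(lines)  # ['{"id": 1}', '{"id": 2}']
```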

The destination can generate events for an event stream. For more information about the event framework, see Dataflow Triggers Overview.