Amazon S3

Supported pipeline types:
  • Data Collector

The Amazon S3 origin reads objects stored in Amazon S3. The object names must share a prefix pattern and should be fully written. To read messages from Amazon SQS, use the Amazon SQS Consumer origin. The Amazon S3 origin can process objects in parallel with multiple threads. For information about supported versions, see Supported Systems and Versions in the Data Collector documentation.
Note: The Amazon S3 origin can be used in standalone pipelines only. To use a cluster pipeline to read from Amazon S3, use a Hadoop FS origin in a cluster EMR batch pipeline that runs on an Amazon EMR cluster. Or, use a Hadoop FS origin in a cluster batch pipeline that runs on a Cloudera distribution of Hadoop (CDH) or Hortonworks Data Platform (HDP) cluster. For more information, see Amazon S3 Requirements for cluster pipelines.
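As a rough illustration of the kind of prefix-based listing the origin performs, the following sketch uses boto3 to enumerate objects under a common prefix and filter them against a glob pattern. The bucket, prefix, and pattern values are hypothetical, and fnmatch is only a loose stand-in for the origin's Ant-style prefix patterns; this is not how Data Collector is implemented internally.

```python
import fnmatch
import boto3

BUCKET = "my-bucket"        # hypothetical bucket name
COMMON_PREFIX = "logs/"     # hypothetical optional common prefix
PREFIX_PATTERN = "*.json"   # glob stand-in for an Ant-style prefix pattern

s3 = boto3.client("s3")

def matching_keys(bucket, common_prefix, pattern):
    """Yield object keys under the common prefix that match the pattern."""
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=common_prefix):
        for obj in page.get("Contents", []):
            # Match the key relative to the common prefix, mirroring how
            # the prefix pattern applies below the common prefix.
            relative = obj["Key"][len(common_prefix):]
            if fnmatch.fnmatch(relative, pattern):
                yield obj["Key"]

for key in matching_keys(BUCKET, COMMON_PREFIX, PREFIX_PATTERN):
    print(key)
```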

With the Amazon S3 origin, you define the region, bucket, prefix pattern, optional common prefix, and read order. These properties determine the objects that the origin processes. You configure the authentication method that the origin uses to connect to Amazon S3. You can optionally include Amazon S3 object metadata in the record as record header attributes.
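The sketch below, continuing the same hypothetical boto3 setup, shows what the read-order choices amount to in practice, ordering a listing by key name or by last-modified timestamp, and the kind of system-defined and user-defined object metadata that could be surfaced as record header attributes. The region, bucket, and prefix values are assumptions for illustration.

```python
import boto3

BUCKET = "my-bucket"   # hypothetical bucket name
PREFIX = "logs/"       # hypothetical prefix

s3 = boto3.client("s3", region_name="us-east-1")   # region is part of the origin config
objects = s3.list_objects_v2(Bucket=BUCKET, Prefix=PREFIX).get("Contents", [])

# Read order by lexicographically ascending key names ...
by_key = sorted(objects, key=lambda o: o["Key"])
# ... or by last-modified timestamp, with key name as a tie-breaker.
by_timestamp = sorted(objects, key=lambda o: (o["LastModified"], o["Key"]))

if by_key:
    # System-defined metadata of the kind that can become record header
    # attributes, plus any user-defined x-amz-meta-* values.
    head = s3.head_object(Bucket=BUCKET, Key=by_key[0]["Key"])
    print(head["ContentLength"], head["LastModified"], head["ETag"])
    print(head.get("Metadata", {}))
```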

After processing an object or upon encountering errors, the origin can keep, archive, or delete the object. When archiving, the origin can copy or move the object.
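In plain S3 terms, keeping an object is a no-op, deleting maps to a delete request, and archiving maps to a copy, optionally followed by a delete of the original (S3 has no native move). A minimal sketch of those operations follows, with hypothetical bucket and archive-prefix names; it illustrates the post-processing options rather than the origin's actual implementation.

```python
import boto3

s3 = boto3.client("s3")

BUCKET = "my-bucket"        # hypothetical bucket name
ARCHIVE_PREFIX = "archive/" # hypothetical archive location

def post_process(key, action="archive", move=True):
    """Keep, archive, or delete an object after processing.

    Archiving copies the object under the archive prefix; a "move"
    additionally deletes the original, since S3 has no native move.
    """
    if action == "keep":
        return
    if action == "archive":
        s3.copy_object(
            Bucket=BUCKET,
            Key=ARCHIVE_PREFIX + key,
            CopySource={"Bucket": BUCKET, "Key": key},
        )
        if not move:
            return
    # Reached for "delete", and for an archive configured as a move.
    s3.delete_object(Bucket=BUCKET, Key=key)
```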

When the pipeline stops, the Amazon S3 origin notes where it stopped reading. When the pipeline restarts, the origin continues processing from that point by default. You can reset the origin to process all requested objects.
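Data Collector manages this offset internally, but the resume behavior can be sketched with boto3's StartAfter listing parameter, which begins a listing strictly after a given key and so fits a key-ordered read. The offset file path and names below are hypothetical, and resetting the origin corresponds to clearing the stored value.

```python
import boto3

s3 = boto3.client("s3")
BUCKET = "my-bucket"                   # hypothetical bucket name
PREFIX = "logs/"                       # hypothetical prefix
OFFSET_FILE = "/tmp/s3-origin-offset"  # hypothetical offset store

def load_offset():
    """Return the last processed key, or "" when starting fresh."""
    try:
        with open(OFFSET_FILE) as f:
            return f.read().strip()
    except FileNotFoundError:
        return ""

def save_offset(key):
    with open(OFFSET_FILE, "w") as f:
        f.write(key)

kwargs = {"Bucket": BUCKET, "Prefix": PREFIX}
offset = load_offset()
if offset:
    # Resume strictly after the last key the previous run processed.
    kwargs["StartAfter"] = offset

for obj in s3.list_objects_v2(**kwargs).get("Contents", []):
    print("processing", obj["Key"])    # stand-in for actual record processing
    save_offset(obj["Key"])
```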

You can configure the origin to decrypt data stored on Amazon S3 with server-side encryption and customer-provided encryption keys. You can optionally use a proxy to connect to Amazon S3.
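For context on what these two options involve at the S3 API level, the sketch below reads an object encrypted with a customer-provided key (SSE-C) through a forward proxy. Reading an SSE-C object requires resupplying the same key used to encrypt it; boto3 base64-encodes the key and adds its MD5 digest automatically. The proxy endpoint, bucket, key name, and encryption key are all hypothetical.

```python
import boto3
from botocore.config import Config

PROXY = {"https": "http://proxy.example.com:3128"}  # hypothetical proxy
CUSTOMER_KEY = b"0" * 32   # placeholder 32-byte AES-256 customer key

# Route all S3 traffic through the proxy.
s3 = boto3.client("s3", config=Config(proxies=PROXY))

# SSE-C decryption: the customer key must accompany the read request.
obj = s3.get_object(
    Bucket="my-bucket",           # hypothetical bucket name
    Key="logs/app.json",          # hypothetical object key
    SSECustomerAlgorithm="AES256",
    SSECustomerKey=CUSTOMER_KEY,
)
print(obj["Body"].read()[:100])
```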

You can also use a connection to configure the origin.

The origin can generate events for an event stream. For more information about dataflow triggers and the event framework, see Dataflow Triggers Overview.