Google Cloud Storage

The Google Cloud Storage origin reads objects stored in Google Cloud Storage. The objects must be fully written and reside in a single bucket. The object names must share a prefix pattern. For information about supported versions, see Supported Systems and Versions in the Data Collector documentation.

With the Google Cloud Storage origin, you define the bucket, prefix pattern, and optional common prefix. These properties determine the objects that the origin processes.
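As a rough illustration of how a common prefix and a prefix pattern can narrow down the objects in a bucket, the following Python sketch filters a list of object names. It assumes glob-style matching through the standard library's fnmatch module, which may differ from the pattern syntax the origin actually supports. The function name select_objects and the sample object names are hypothetical.

```python
from fnmatch import fnmatchcase


def select_objects(object_names, prefix_pattern, common_prefix=""):
    """Return the names that match common_prefix followed by prefix_pattern.

    Assumption: glob-style matching, where * can also match "/".
    The origin's real pattern rules may be stricter.
    """
    full_pattern = common_prefix + prefix_pattern
    return [name for name in object_names if fnmatchcase(name, full_pattern)]


# Hypothetical object names in a single bucket.
names = [
    "logs/2024/app-01.json",
    "logs/2024/app-02.json",
    "logs/2024/readme.txt",
    "metrics/2024/app-01.json",
]

# Only objects under logs/2024/ whose names match app-*.json are selected.
selected = select_objects(names, "app-*.json", common_prefix="logs/2024/")
print(selected)
```

In this sketch, the common prefix is simply prepended to the pattern, so leaving it empty means the pattern is applied to the full object name.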

You also define the project ID and credentials to use when connecting to Google Cloud Storage. You can also use a connection to configure the origin.

After processing an object or upon encountering errors, the origin can keep, archive, or delete the object. When archiving, the origin can copy or move the object.

When the pipeline stops, the Google Cloud Storage origin notes where it stops reading. When the pipeline starts again, the origin continues processing from where it stopped by default. You can reset the origin to process all requested objects.
Note: The origin processes objects based on object names and locations. Having objects with the same name in the same location can cause the origin to skip reading the duplicate objects.
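The effect described in the note can be pictured with a simplified model in Python. This is not the origin's actual offset mechanism, which is not described here. It only assumes that an object is identified by its bucket and name, so a second object with the same identity looks already processed. The class NameBasedTracker and the bucket and object names are hypothetical.

```python
class NameBasedTracker:
    """Toy model: an object is identified only by (bucket, name)."""

    def __init__(self):
        self._seen = set()

    def should_read(self, bucket, name):
        """Return True the first time an identity is seen, False afterwards."""
        key = (bucket, name)
        if key in self._seen:
            return False  # same name and location: treated as already read
        self._seen.add(key)
        return True

    def reset(self):
        """Forget all progress, similar in spirit to resetting the origin."""
        self._seen.clear()


tracker = NameBasedTracker()
first = tracker.should_read("my-bucket", "logs/app-01.json")
duplicate = tracker.should_read("my-bucket", "logs/app-01.json")
tracker.reset()
after_reset = tracker.should_read("my-bucket", "logs/app-01.json")
print(first, duplicate, after_reset)
```

Under this model, giving each object a unique name avoids the skipped read.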

The origin can generate events for an event stream. For more information about dataflow triggers and the event framework, see Dataflow Triggers Overview.