Start Pipelines (deprecated)
The Start Pipelines origin can start pipelines that run on any StreamSets execution engine, such as Data Collector, Data Collector Edge, or Transformer. However, a single origin starts pipelines only on the execution engine specified in the stage. To start pipelines on a different execution engine, use a Start Pipelines processor.
The origin generates a record that contains a list of the started pipelines and information about those pipelines. You can pass the record to an orchestration stage to trigger another task. Or, you can pass it to a non-orchestration stage to perform other processing. For example, you might use a Stream Selector processor to pass the record to different stages based on a pipeline completion status.
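The routing decision in that example can be sketched in plain Python. This is only an illustration of the kind of condition a Stream Selector might evaluate, not StreamSets expression syntax. The `route` function and the lane names are hypothetical, and the record layout assumes the `orchestratorTasks`, `pipelineResults`, and `finishedSuccessfully` fields described under Generated Record.

```python
def route(record, task_name):
    """Return a hypothetical lane name based on pipeline completion status."""
    results = record["orchestratorTasks"][task_name]["pipelineResults"]
    finished = [p.get("finishedSuccessfully") for p in results.values()]
    if any(status is None for status in finished):
        # No completion status yet, for example when pipelines run in the background.
        return "still_running"
    return "succeeded" if all(finished) else "failed"


# Illustrative record with made-up task and pipeline names.
example = {
    "orchestratorTasks": {
        "my_task": {
            "pipelineResults": {
                "pipeline_a": {"finishedSuccessfully": True},
                "pipeline_b": {"finishedSuccessfully": False},
            }
        }
    }
}
lane = route(example, "my_task")
```

Here `lane` is `"failed"` because one of the two pipelines reports an unsuccessful completion.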
When you configure the Start Pipelines origin, you define the URL of the execution engine that runs the pipelines. You specify the names or IDs of the pipelines to start along with any runtime parameters to use. For an execution engine registered with Control Hub, you specify the Control Hub URL, so the origin starts the pipelines through Control Hub.
You can configure the origin to reset the origins in the pipelines when possible, and to run the pipelines in the background. When running pipelines in the background, the origin immediately passes its generated record downstream instead of waiting for the pipelines to finish.
You also configure the user name and password used to run the pipelines, and you can optionally configure SSL/TLS properties.
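As a rough summary of the settings described above, the following Python sketch models them as a data structure with a basic sanity check. All class names, field names, and the example URL are hypothetical and chosen for illustration. They are not the property names used by the stage.

```python
from dataclasses import dataclass, field


@dataclass
class PipelineToStart:
    pipeline_id: str                                   # pipeline name or ID
    runtime_parameters: dict = field(default_factory=dict)


@dataclass
class StartPipelinesSettings:
    engine_url: str          # execution engine URL, or Control Hub URL for a registered engine
    pipelines: list          # list of PipelineToStart
    username: str
    password: str
    reset_origin: bool = False        # reset origins in the started pipelines when possible
    run_in_background: bool = False   # pass the record downstream without waiting
    use_tls: bool = False             # optional SSL/TLS

    def validate(self):
        """Return a list of problems; an empty list means the settings look complete."""
        problems = []
        if not self.engine_url.startswith(("http://", "https://")):
            problems.append("engine_url must be an HTTP(S) URL")
        if not self.pipelines:
            problems.append("at least one pipeline is required")
        if not self.username or not self.password:
            problems.append("user name and password are required")
        return problems


settings = StartPipelinesSettings(
    engine_url="https://engine.example.com",           # hypothetical URL
    pipelines=[PipelineToStart("example-pipeline-id", {"TARGET_DIR": "/tmp/out"})],
    username="user",
    password="secret",
    run_in_background=True,
)
problems = settings.validate()
```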
Pipeline Execution and Data Flow
The Start Pipelines origin starts the specified pipelines when the orchestration pipeline that contains the origin starts. The origin creates a record with task details and passes it downstream based on how the pipelines run:
- Run pipelines in the foreground
  - By default, the origin starts pipelines that run in the foreground. When the pipelines run in the foreground, the origin passes the orchestration record downstream after all the started pipelines complete.
- Run pipelines in the background
  - You can configure the origin to start pipelines that run in the background. When pipelines run in the background, the origin updates and passes the orchestration record downstream immediately after starting the pipelines.
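The difference between the two modes can be illustrated with a small simulation. This is a sketch using Python threads, not how the origin is implemented. `start_pipelines` and `run_pipeline` are hypothetical names, and the simulated pipelines are plain Python callables.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, wait


def start_pipelines(pipeline_ids, run_pipeline, background=False):
    """Simulate the origin: start every pipeline, then return per-pipeline results."""
    executor = ThreadPoolExecutor(max_workers=max(1, len(pipeline_ids)))
    futures = {pid: executor.submit(run_pipeline, pid) for pid in pipeline_ids}
    results = {pid: {"startedSuccessfully": True} for pid in pipeline_ids}
    if not background:
        # Foreground: block until all started pipelines complete.
        wait(list(futures.values()))
        for pid, future in futures.items():
            results[pid]["finishedSuccessfully"] = future.exception() is None
    # Background: return right away; the simulated pipelines keep running.
    executor.shutdown(wait=False)
    return results


# Foreground run: results include a completion status for each pipeline.
foreground = start_pipelines(["p1", "p2"], lambda pid: None)

# Background run: the simulated pipeline blocks until released, yet results return at once.
release = threading.Event()
background = start_pipelines(
    ["p1"], lambda pid: release.wait(timeout=5), background=True
)
still_running = "finishedSuccessfully" not in background["p1"]
release.set()  # let the simulated pipeline finish
```

In the foreground case every entry carries a `finishedSuccessfully` value. In the background case the entry has only `startedSuccessfully`, which mirrors the record being passed downstream before the pipelines complete.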
Generated Record
The Start Pipelines origin creates an orchestration record that includes information about the pipelines that it starts.
Field Name | Description |
---|---|
orchestratorTasks | List Map field that contains task details for the orchestration pipeline. Most orchestration stages add details about their completed tasks within this field. |
<unique task name> | List Map field within the orchestratorTasks field that contains details about the task, including the pipelineResults field. |
<pipeline ID> | List Map field within the pipelineResults field that provides details about each pipeline, including the startedSuccessfully and pipelineStatus fields and, after the pipeline completes, the finishedSuccessfully field. |
For example, consider the record generated by a Start Pipelines origin with the Load_ADLS task name that runs one pipeline in the background. In that record, the startedSuccessfully and pipelineStatus fields indicate that the pipeline was started successfully. There is no finishedSuccessfully field because the pipeline has not yet completed.
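The shape of such a record can be approximated as a nested Python dictionary. This is a simplified sketch limited to the fields named in this section. The pipeline ID and the `RUNNING` status value are illustrative assumptions, and a real record may contain additional fields.

```python
record = {
    "orchestratorTasks": {
        "Load_ADLS": {                                  # unique task name
            "pipelineResults": {
                "example-pipeline-id": {                # hypothetical pipeline ID
                    "startedSuccessfully": True,
                    "pipelineStatus": "RUNNING",        # illustrative status value
                    # No finishedSuccessfully key: the pipeline has not completed.
                }
            }
        }
    }
}


def pipeline_state(result):
    """Describe one pipeline entry based on which fields are present."""
    if not result.get("startedSuccessfully"):
        return "not started"
    if "finishedSuccessfully" not in result:
        return "started, not yet completed"
    return "completed" if result["finishedSuccessfully"] else "completed with errors"


results = record["orchestratorTasks"]["Load_ADLS"]["pipelineResults"]
states = {pid: pipeline_state(result) for pid, result in results.items()}
```

Checking for the presence of the `finishedSuccessfully` key, rather than only its value, distinguishes a pipeline that is still running from one that finished unsuccessfully.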
For an example of an orchestration record, see Example.