Connecting Stages

As described in earlier sections and shown in the first example, you define the flow of a pipeline by connecting its stages together.

Output Lanes

To connect the output lane of one stage to the input lane of another, use the >> operator between two streamsets.sdk.sdc_models.Stage instances:

dev_raw_data_source >> trash

For stages with multiple output paths, the >> operator can be used multiple times:

file_tail = builder.add_stage('File Tail')
trash_1 = builder.add_stage('Trash')
trash_2 = builder.add_stage('Trash')
file_tail >> trash_1
file_tail >> trash_2

It is also possible to connect a stage with a single output path to the inputs of multiple stages. To do this, put the streamsets.sdk.sdc_models.Stage instances that should receive the same output into a list, and use that list as the right-hand side of the >> operator:

trash_1 = builder.add_stage('Trash')
trash_2 = builder.add_stage('Trash')
dev_raw_data_source >> [trash_1, trash_2]

The steps above create a pipeline like the one in the following image:

[Image: the resulting pipeline]
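
To illustrate how an operator such as >> can accept either a single stage or a list of stages, here is a minimal, self-contained sketch in plain Python. The SketchStage class and its output_targets attribute are hypothetical names made up for this example. It is not streamsets.sdk.sdc_models.Stage and makes no claim about how the SDK implements the operator internally.

```python
class SketchStage:
    """Hypothetical stand-in for a pipeline stage (not the SDK's Stage class)."""

    def __init__(self, name):
        self.name = name
        self.output_targets = []  # names of stages fed by this stage's output

    def __rshift__(self, other):
        # Accept either a single stage or a list of stages on the right-hand side.
        targets = other if isinstance(other, list) else [other]
        for target in targets:
            self.output_targets.append(target.name)
        return other


dev_raw_data_source = SketchStage('dev_raw_data_source')
trash_1 = SketchStage('trash_1')
trash_2 = SketchStage('trash_2')

# One output connected to the inputs of two stages.
dev_raw_data_source >> [trash_1, trash_2]

print(dev_raw_data_source.output_targets)  # ['trash_1', 'trash_2']
```

In this sketch, normalizing the right-hand side to a list lets a single method handle both the one-to-one and the one-to-many case.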


Event Lanes

To connect the event lane of one stage to another stage, use the >= operator:

dev_data_generator >> trash_1
dev_data_generator >= trash_2
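
As a rough illustration of how two operators can route to two different kinds of lanes, the following plain-Python sketch overloads both >> and >=. SketchStage, output_targets, and event_targets are hypothetical names invented for this example. The sketch is an assumption about one possible design, not the SDK's actual Stage class.

```python
class SketchStage:
    """Hypothetical stand-in for a pipeline stage (not the SDK's Stage class)."""

    def __init__(self, name):
        self.name = name
        self.output_targets = []  # stages fed from the data output lane
        self.event_targets = []   # stages fed from the event lane

    @staticmethod
    def _as_list(other):
        # Allow a single stage or a list of stages on the right-hand side.
        return other if isinstance(other, list) else [other]

    def __rshift__(self, other):
        # stage >> other: record a connection from the data output lane.
        self.output_targets.extend(t.name for t in self._as_list(other))
        return other

    def __ge__(self, other):
        # stage >= other: record a connection from the event lane.
        self.event_targets.extend(t.name for t in self._as_list(other))
        return other


dev_data_generator = SketchStage('dev_data_generator')
trash_1 = SketchStage('trash_1')
trash_2 = SketchStage('trash_2')

dev_data_generator >> trash_1   # data records go to trash_1
dev_data_generator >= trash_2   # events go to trash_2

print(dev_data_generator.output_targets)  # ['trash_1']
print(dev_data_generator.event_targets)   # ['trash_2']
```

Keeping the two kinds of connections in separate lists mirrors the distinction the text draws between output lanes and event lanes.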
