Connecting Stages#


As described in earlier sections and as shown in the first example, connecting stages together to create the flow of your pipeline is essential to its use.

Output Lanes#

To connect the output lane of one stage to the input lane of another, simply use the >> operator between two streamsets.sdk.sdc_models.Stage instances:

dev_raw_data_source >> trash

For stages with multiple output paths, the >> operator can be used multiple times:

file_tail = builder.add_stage('File Tail')
file_tail >> trash_1
file_tail >> trash_2
../../_images/file_tail_to_two_trashes.png

It is also possible to connect a stage with a single output path to the inputs of multiple stages.

To accomplish this, the >> operator expects that the streamsets.sdk.sdc_models.Stage instances, to which you’ll be connecting the same output, are put into a list:

trash_1 = builder.add_stage('Trash')
trash_2 = builder.add_stage('Trash')
dev_raw_data_source >> [trash_1, trash_2]

Using the above steps creates a pipeline like the one in the image below:

../../_images/dev_data_generator_to_two_trashes.png

Event Lanes#

To connect the event lane of one stage to another, use the >= operator:

dev_data_generator >> trash_1
dev_data_generator >= trash_2

../../_images/dev_data_generator_with_events.png