Connecting Stages
As described in earlier sections and shown in the first example, connecting stages is what defines the flow of data through your pipeline.
Output Lanes
To connect the output lane of one stage to the input lane of another, use the >> operator between two streamsets.sdk.sdc_models.Stage instances:
dev_raw_data_source >> trash
For stages with multiple output paths, the >> operator can be used multiple times:
file_tail = builder.add_stage('File Tail')
file_tail >> trash_1
file_tail >> trash_2
It is also possible to connect a stage with a single output path to the inputs of multiple stages. To do this, put the streamsets.sdk.sdc_models.Stage instances that should receive the same output into a list and pass that list to the >> operator:
trash_1 = builder.add_stage('Trash')
trash_2 = builder.add_stage('Trash')
dev_raw_data_source >> [trash_1, trash_2]
The steps above create a pipeline in which the single output lane of dev_raw_data_source feeds both Trash stages.
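To make the list form of >> less opaque, here is a minimal sketch of how such an operator can be modeled in plain Python. It is a toy stand-in, not the SDK's implementation: the Stage class below, its downstream attribute, and the stage names are all hypothetical and exist only for illustration.

```python
class Stage:
    """Toy stand-in for a pipeline stage (hypothetical, not the SDK's Stage class)."""

    def __init__(self, name):
        self.name = name
        self.downstream = []  # stages that receive this stage's output

    def __rshift__(self, other):
        # Accept either a single stage or a list of stages.
        targets = other if isinstance(other, list) else [other]
        self.downstream.extend(targets)
        # Returning the right-hand side lets single stages be chained: a >> b >> c
        return other


source = Stage('dev_raw_data_source')
trash_1 = Stage('trash_1')
trash_2 = Stage('trash_2')

# One output fanned out to two stages, mirroring the list syntax above.
source >> [trash_1, trash_2]
print([stage.name for stage in source.downstream])  # ['trash_1', 'trash_2']
```

In this sketch, wrapping the right-hand side in a list is what lets one call record several targets for the same output.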
Event Lanes
To connect the event lane of one stage to another, use the >= operator:
dev_data_generator >> trash_1
dev_data_generator >= trash_2
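The distinction between the two operators can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: the Stage class and its outputs and events attributes are hypothetical and do not reflect how the SDK stores connections internally.

```python
class Stage:
    """Toy stand-in for a stage with separate data and event lanes (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.outputs = []  # stages connected to the data output lane
        self.events = []   # stages connected to the event lane

    def __rshift__(self, other):
        # `>>` records a data-lane connection.
        self.outputs.append(other)
        return other

    def __ge__(self, other):
        # `>=` records an event-lane connection.
        self.events.append(other)
        return other


dev_data_generator = Stage('dev_data_generator')
trash_1 = Stage('trash_1')
trash_2 = Stage('trash_2')

dev_data_generator >> trash_1   # data records go to trash_1
dev_data_generator >= trash_2   # events go to trash_2

print([s.name for s in dev_data_generator.outputs])  # ['trash_1']
print([s.name for s in dev_data_generator.events])   # ['trash_2']
```

Keeping the two kinds of connection in separate lists mirrors the idea that a stage's data output and its event output are independent lanes, each with its own downstream stages.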