Creating a pipeline
Once the authentication step has been handled and you’ve successfully instantiated a
streamsets.sdk.DataCollector object, you’re ready to build a pipeline.
Instantiating a Pipeline Builder
The first step of creating a pipeline is to instantiate a streamsets.sdk.sdc_models.PipelineBuilder.
This class handles the majority of the pipeline configuration on your behalf by building the initial JSON representation
of the pipeline, and configuring default values for essential properties (instead of requiring each to be set manually).
A streamsets.sdk.sdc_models.PipelineBuilder instance can be created as follows:
pipeline_builder = sdc.get_pipeline_builder()
Adding Stages to the Pipeline Builder
Now that the builder has been instantiated, you can get
streamsets.sdk.sdc_models.Stage instances from this
builder for use in the pipeline you’re creating. Adding stages to the pipeline can be done by calling
streamsets.sdk.sdc_models.PipelineBuilder.add_stage(). See the API reference for this method for details on the
arguments it takes.
As shown in the first example, the simplest type of pipeline directs one origin into one
destination. For this example, you can do this with the
Dev Raw Data Source origin and the Trash destination:
dev_raw_data_source = pipeline_builder.add_stage('Dev Raw Data Source')
trash = pipeline_builder.add_stage('Trash')
Connecting the Stages
With streamsets.sdk.sdc_models.Stage instances in hand, you can connect them by using the
>> operator. Once the stages are connected, you can build the
streamsets.sdk.sdc_models.Pipeline instance with the
streamsets.sdk.sdc_models.PipelineBuilder.build() method:
dev_raw_data_source >> trash
pipeline = pipeline_builder.build('My first pipeline')
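The three steps in this section (get a builder, add stages, connect and build) can be collected into a single helper. What follows is a minimal sketch, assuming sdc is an already-authenticated Data Collector object like the one used in the snippets above. The function name build_first_pipeline and its title parameter are hypothetical and not part of the SDK; only get_pipeline_builder(), add_stage(), the >> operator, and build() come from this page.

```python
def build_first_pipeline(sdc, title='My first pipeline'):
    """Build a one-origin, one-destination pipeline.

    `sdc` is assumed to be a connected Data Collector object; this helper
    is illustrative only and is not provided by the SDK.
    """
    # Step 1: ask the Data Collector object for a pipeline builder.
    pipeline_builder = sdc.get_pipeline_builder()

    # Step 2: add the origin and the destination by their stage labels.
    dev_raw_data_source = pipeline_builder.add_stage('Dev Raw Data Source')
    trash = pipeline_builder.add_stage('Trash')

    # Step 3: route the origin's output into the destination, then build.
    dev_raw_data_source >> trash
    return pipeline_builder.build(title)
```

Calling build_first_pipeline(sdc) would return the same kind of object as the pipeline variable above. Wrapping the steps in a function keeps the builder and stage variables local, which can help when a script builds several pipelines in a row.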