Overview

You build pipelines in Control Hub to define how data flows from origin to destination systems and how the data is processed along the way.

You can create Data Collector, Transformer, or Transformer for Snowflake pipelines. The pipeline types differ in the available functionality and in how the engine runs the pipeline. For more information, see Comparing StreamSets Pipelines and "Comparing Snowflake and Other StreamSets Engines" in the Transformer for Snowflake documentation.

When you create a Data Collector or Transformer pipeline, you select the authoring engine to use for pipeline design. The selected authoring engine determines the stages and functionality that display in the pipeline canvas. For example, the pipeline canvas allows you to add Google stages to the pipeline only when the selected authoring engine has the Google stage library installed. Transformer for Snowflake pipelines do not require an authoring engine.
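The relationship between the authoring engine and the stages offered in the canvas can be pictured with a small model. This is a minimal sketch and does not use any Control Hub API: the `Stage` and `AuthoringEngine` classes, the `available_stages` function, and the library names such as `google-cloud` are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Set


# Hypothetical model for illustration only; none of these names come from
# a StreamSets API.
@dataclass(frozen=True)
class Stage:
    name: str
    library: str  # the stage library that provides this stage


@dataclass
class AuthoringEngine:
    engine_type: str
    installed_libraries: Set[str] = field(default_factory=set)


def available_stages(engine: AuthoringEngine, catalog: List[Stage]) -> List[Stage]:
    """Return the stages a canvas could offer for the selected engine."""
    return [s for s in catalog if s.library in engine.installed_libraries]


# Assumed catalog entries and library identifiers (made up for this sketch).
catalog = [
    Stage("Google BigQuery", "google-cloud"),
    Stage("Kafka Consumer", "apache-kafka"),
    Stage("Expression Evaluator", "basic"),
]

# This engine has no Google stage library installed, so the Google stage
# is filtered out of the result.
engine = AuthoringEngine("data_collector", {"basic", "apache-kafka"})
names = [s.name for s in available_stages(engine, catalog)]
print(names)
```

In this model, adding `"google-cloud"` to `engine.installed_libraries` is what makes the Google stage appear, which mirrors the behavior described above.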

Control Hub provides release management for your pipelines and pipeline fragments by maintaining a version history of each pipeline or fragment. You can compare one pipeline or fragment version with another version. You can also add tags to pipeline and fragment versions to easily differentiate between versions.
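As a rough illustration of what a version history with tags and version comparison involves, the following sketch models the idea in plain Python. It is not how Control Hub stores or compares versions: the `PipelineVersion` and `PipelineHistory` classes, the integer version numbers, and the configuration keys are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


# Hypothetical model for illustration only; not a Control Hub API.
@dataclass
class PipelineVersion:
    version: int
    config: Dict[str, object]
    committed_by: str
    commit_message: str
    tags: List[str] = field(default_factory=list)


@dataclass
class PipelineHistory:
    name: str
    versions: List[PipelineVersion] = field(default_factory=list)

    def publish(self, config: Dict[str, object], user: str, message: str) -> PipelineVersion:
        """Record a new version; earlier versions are kept unchanged."""
        version = PipelineVersion(len(self.versions) + 1, dict(config), user, message)
        self.versions.append(version)
        return version

    def get(self, version: int) -> PipelineVersion:
        return self.versions[version - 1]

    def compare(self, old: int, new: int) -> Dict[str, Tuple[object, object]]:
        """Return {key: (old_value, new_value)} for every setting that differs."""
        a, b = self.get(old).config, self.get(new).config
        changes = {}
        for key in sorted(set(a) | set(b)):
            if a.get(key) != b.get(key):
                changes[key] = (a.get(key), b.get(key))
        return changes


history = PipelineHistory("orders_to_warehouse")
history.publish({"batch_size": 1000, "origin": "jdbc"}, "alice", "Initial version")
v2 = history.publish(
    {"batch_size": 5000, "origin": "jdbc", "retries": 3}, "bob", "Tune batch size"
)
v2.tags.append("release-candidate")  # a tag helps tell versions apart later

diff = history.compare(1, 2)
print(diff)
```

Here the comparison reports only the settings that changed between the two versions, and the tag marks the second version so that it is easy to find again.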

In the Pipelines view, you manage all StreamSets pipelines the same way. For example, you use the same steps to create, duplicate, filter, publish, and delete all pipeline types.

As you build pipelines and fragments, you can use data preview to see how source data passes through the pipeline or fragment.

For details about building pipelines in the canvas, including how to configure individual pipeline stages, see the engine documentation for the pipeline type: Data Collector, Transformer, or Transformer for Snowflake.

Working with Pipelines

The Pipelines view lists all pipelines that you have access to.

You can complete the following tasks in the Pipelines view:

  • Create, configure, and publish a pipeline.
  • Search for pipelines.
  • View pipeline details, including the user who committed the pipeline and the commit time and message.
  • View the configuration of each pipeline and of each stage in a pipeline.
  • Duplicate a pipeline.
  • Import pipelines that have been exported from Control Hub.
  • Export pipelines.
  • Create a template from a pipeline.
  • Create a job for a pipeline.
  • View all jobs that include a specific pipeline version.
  • Share a pipeline with other users and groups.
  • Delete a pipeline or pipeline version.

The following image shows the available pipelines and displays the details of one of the pipelines:

Note the following icons that display for all pipelines or when you hover over a single pipeline. You'll use these icons frequently as you manage pipelines:

| Icon name | Description |
|---|---|
| Data Collector Engine Type | Displayed for Data Collector pipelines. |
| Transformer Engine Type | Displayed for Transformer pipelines. |
| Transformer for Snowflake Engine Type | Displayed for Transformer for Snowflake pipelines. |
| Create New Pipeline | Create a new pipeline. |
| Import | Import pipelines that have been exported from Control Hub. |
| Refresh | Refresh the list of pipelines. |
| Duplicate Pipeline | Duplicate the selected pipeline. |
| Compare with Previous Version | Compare a pipeline version with a previous version. |
| Create Job | Create a job for the pipeline. |
| Export | Export the selected pipelines. |
| Share | Share the pipeline with other users and groups, as described in Permissions. |
| Delete | Delete the pipeline or pipeline version. |