What is StreamSets DataOps Platform?
StreamSets DataOps Platform™ is a cloud-native platform for building, running, and monitoring data pipelines.
A pipeline describes the flow of data from origin to destination systems and defines how to process the data along the way. Pipelines can access multiple types of external systems, including cloud data lakes, cloud data warehouses, and on-premises storage systems such as relational databases.
As a pipeline runs, you can view real-time statistics and error information about the data as it flows from origin to destination systems.
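The origin-to-destination flow described above can be illustrated with a small, generic sketch. This is not the StreamSets API: the names `run_pipeline`, `PipelineStats`, and `parse_amount` are hypothetical and exist only to show the idea of reading records from an origin, processing them, writing them to a destination, and counting records and errors along the way.

```python
from dataclasses import dataclass, field


@dataclass
class PipelineStats:
    """Hypothetical counters, loosely mirroring per-pipeline statistics."""
    input_records: int = 0
    output_records: int = 0
    error_records: int = 0
    errors: list = field(default_factory=list)


def run_pipeline(origin, processors, destination):
    """Read each record from the origin, apply every processor in order,
    and hand the result to the destination. Failed records are counted
    and their error messages kept instead of stopping the run."""
    stats = PipelineStats()
    for record in origin:
        stats.input_records += 1
        try:
            for process in processors:
                record = process(record)
            destination(record)
            stats.output_records += 1
        except Exception as exc:
            stats.error_records += 1
            stats.errors.append(f"{type(exc).__name__}: {exc}")
    return stats


def parse_amount(record):
    """Example processor: convert the 'amount' field from text to a number."""
    return {**record, "amount": float(record["amount"])}


# Illustrative data only: an in-memory list stands in for an origin system,
# and another list stands in for a destination system.
rows = [
    {"id": 1, "amount": "9.50"},
    {"id": 2, "amount": "n/a"},  # cannot be parsed, becomes an error record
    {"id": 3, "amount": "12"},
]
written = []
stats = run_pipeline(rows, [parse_amount], written.append)
print(stats.input_records, stats.output_records, stats.error_records)
```

In this sketch a bad record is counted and reported rather than halting the run, which is one common design choice for keeping data flowing while still surfacing error information.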
StreamSets DataOps Platform uses the following components to manage your pipelines:
- StreamSets Control Hub
  Control Hub is a public cloud service hosted by StreamSets, which you access using a web browser. Use Control Hub to build, manage, and monitor your pipelines.
- StreamSets engines
  StreamSets engines reside in your corporate network, which can be on-premises or on a protected cloud computing platform. The engines run headless, without a UI. StreamSets has two data plane engines, Data Collector and Transformer. You can deploy each engine independently and manage both together in Control Hub.
The following image provides an overview of the StreamSets DataOps Platform components: