Overview

An origin stage represents the source of data for the pipeline. A pipeline can use a single origin stage, or it can use multiple origin stages whose data you then combine using a Join or Union processor.
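
The difference between combining origins with a Union and with a Join can be illustrated with a small conceptual sketch. This is plain Python, not Transformer's API: the functions `union` and `inner_join`, the sample records, and the `id` key are all hypothetical stand-ins, and the sketch assumes that a union stacks records with the same schema while a join matches records on a shared key.

```python
# Conceptual sketch only: plain Python stand-ins for two or more origins and
# the Union and Join processors. None of these names are Transformer APIs.

def union(left, right):
    """Stack the records of two origins into one stream (same schema assumed)."""
    return left + right

def inner_join(left, right, key):
    """Pair records from two origins that share the same value for `key`."""
    # Index the right-hand records by key so each left record is matched quickly.
    index = {}
    for record in right:
        index.setdefault(record[key], []).append(record)
    joined = []
    for record in left:
        for match in index.get(record[key], []):
            # Merge the fields of both records into one output record.
            joined.append({**match, **record})
    return joined

# Hypothetical data standing in for what three origins might read.
orders_from_s3 = [{"id": 1, "total": 20}, {"id": 2, "total": 35}]
orders_from_kafka = [{"id": 3, "total": 12}]
customers_from_jdbc = [{"id": 1, "name": "Ada"}, {"id": 3, "name": "Lin"}]

# Union: same kind of records from two origins become one larger data set.
all_orders = union(orders_from_s3, orders_from_kafka)

# Join: records from different origins are matched on a shared field.
orders_with_names = inner_join(all_orders, customers_from_jdbc, "id")
```

In this sketch, a union suits origins that read the same kind of record from different places, while a join suits origins whose records describe different things and share a key.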

You can use the following origins in a Transformer pipeline:

ADLS Gen2 - Reads files in Azure Data Lake Storage Gen2.
Amazon S3 - Reads objects in Amazon S3.
Azure Event Hubs - Reads events from Azure Event Hubs.
Delta Lake - Reads data from a Delta Lake table.
File - Reads files in HDFS or local file systems.
Google BigQuery - Reads data from a BigQuery table.
Hive - Reads data from a Hive table.
JDBC Query - Reads data from a database through JDBC using a specified query.
JDBC Table - Reads data from a database table using a JDBC driver.
Kafka - Reads data from topics in an Apache Kafka cluster.
Kudu - Reads data from a Kudu table.
MySQL JDBC Table - Reads data from a MySQL table.
Oracle JDBC Table - Reads data from an Oracle table.
PostgreSQL JDBC Table - Reads data from a PostgreSQL table.
Snowflake - Reads data from a Snowflake database.
SQL Server JDBC Table - Reads data from a Microsoft SQL Server table.
Unity Catalog - Reads data from a Databricks Unity Catalog table.
Whole Directory - Reads all files within a directory in a single batch.