Stage Libraries

A Control Hub deployment defines the stage libraries that are installed on all engine instances managed by the deployment. When you configure any deployment type, you select the stage libraries to install on the engine.

Important: You must perform additional steps to install the MapR stage libraries, as described in MapR Prerequisites.

Common Stage Libraries

Common stage libraries contain the most commonly used stages.

The following table describes the stages installed with each common stage library:
Stage Library Name Included Stages
streamsets-datacollector-aerospike-client-lib For Aerospike version 6.x.

Includes the Aerospike Client destination.

streamsets-datacollector-apache-kafka_1_0-lib For Kafka version 1.0.x.
Includes:
  • Kafka Multitopic Consumer origin
  • Kafka Producer destination
streamsets-datacollector-apache-kafka_1_1-lib For Kafka version 1.1.x.
Includes:
  • Kafka Multitopic Consumer origin
  • Kafka Producer destination
streamsets-datacollector-apache-kafka_2_0-lib For Kafka version 2.0.x.
Includes:
  • Kafka Multitopic Consumer origin
  • Kafka Producer destination
streamsets-datacollector-apache-kafka_2_1-lib For Kafka version 2.1.x.
Includes:
  • Kafka Multitopic Consumer origin
  • Kafka Producer destination
streamsets-datacollector-apache-kafka_2_2-lib For Kafka version 2.2.x.
Includes:
  • Kafka Multitopic Consumer origin
  • Kafka Producer destination
streamsets-datacollector-apache-kafka_2_3-lib For Kafka version 2.3.x.
Includes:
  • Kafka Multitopic Consumer origin
  • Kafka Producer destination
streamsets-datacollector-apache-kafka_2_4-lib For Kafka version 2.4.x.
Includes:
  • Kafka Multitopic Consumer origin
  • Kafka Producer destination
streamsets-datacollector-apache-kafka_2_5-lib For Kafka version 2.5.x.
Includes:
  • Kafka Multitopic Consumer origin
  • Kafka Producer destination
streamsets-datacollector-apache-kafka_2_6-lib For Kafka version 2.6.x.
Includes:
  • Kafka Multitopic Consumer origin
  • Kafka Producer destination
streamsets-datacollector-apache-kafka_2_7-lib For Kafka version 2.7.x.
Includes:
  • Kafka Multitopic Consumer origin
  • Kafka Producer destination
streamsets-datacollector-apache-kafka_2_8-lib For Kafka version 2.8.x.
Includes:
  • Kafka Multitopic Consumer origin
  • Kafka Producer destination
streamsets-datacollector-apache-kafka_3_0-lib For Kafka version 3.0.x.
Includes:
  • Kafka Multitopic Consumer origin
  • Kafka Producer destination
streamsets-datacollector-apache-kafka_3_1-lib For Kafka version 3.1.x.
Includes:
  • Kafka Multitopic Consumer origin
  • Kafka Producer destination
streamsets-datacollector-apache-kafka_3_2-lib For Kafka version 3.2.x.
Includes:
  • Kafka Multitopic Consumer origin
  • Kafka Producer destination
streamsets-datacollector-apache-kafka_3_3-lib For Kafka version 3.3.x.
Includes:
  • Kafka Multitopic Consumer origin
  • Kafka Producer destination
streamsets-datacollector-apache-kafka_3_4-lib For Kafka version 3.4.x.
Includes:
  • Kafka Multitopic Consumer origin
  • Kafka Producer destination
streamsets-datacollector-apache-kafka_3_5-lib For Kafka version 3.5.x.
Includes:
  • Kafka Multitopic Consumer origin
  • Kafka Producer destination
streamsets-datacollector-apache-kafka_3_6-lib For Kafka version 3.6.x.
Includes:
  • Kafka Multitopic Consumer origin
  • Kafka Producer destination
streamsets-datacollector-apache-pulsar_2-lib For Apache Pulsar version 2.x.
Includes:
  • Pulsar Consumer origin
  • Pulsar Consumer (Legacy) origin
  • Pulsar Producer destination
streamsets-datacollector-apache-solr_6_1_0-lib For Apache Solr version 6.1.

Includes the Solr destination.

streamsets-datacollector-aws-lib For Amazon Web Services 1.11.x.
Includes:
  • Amazon S3 origin
  • Amazon SQS Consumer origin
  • Amazon S3 destination
  • Amazon S3 executor
streamsets-datacollector-aws-secrets-manager-credentialstore-lib For the AWS Secrets Manager credential store.
streamsets-datacollector-azure-keyvault-credentialstore-lib For the Microsoft Azure Key Vault credential store.
streamsets-datacollector-azure-lib For Microsoft Azure.
Includes:
  • Azure Blob Storage origin
  • Azure Data Lake Storage Gen2 origin
  • Azure Data Lake Storage Gen2 (Legacy) origin
  • Azure IoT/Event Hub Consumer origin
  • Azure Blob Storage destination
  • Azure Data Lake Storage Gen2 destination
  • Azure Event Hub Producer destination
  • Azure IoT Hub Producer destination
  • Azure Synapse SQL destination
  • ADLS Gen2 File Metadata executor
streamsets-datacollector-basic-lib
Includes the following origins:
  • CoAP Server
  • Directory
  • File Tail
  • HTTP Client
  • HTTP Server
  • JavaScript Scripting
  • MQTT Subscriber
  • OPC UA Client
  • REST Service
  • SFTP/FTP/FTPS Client
  • TCP Server
  • UDP Multithreaded Source
  • UDP Source
  • WebSocket Client
  • WebSocket Server
Includes the following processors:
  • Base64 Field Decoder
  • Base64 Field Encoder
  • Data Generator
  • Data Parser
  • Delay
  • Expression Evaluator
  • Field Flattener
  • Field Hasher
  • Field Mapper
  • Field Masker
  • Field Merger
  • Field Order
  • Field Pivoter
  • Field Remover
  • Field Renamer
  • Field Replacer
  • Field Splitter
  • Field Type Converter
  • Field Zip
  • Geo IP
  • HTTP Client
  • HTTP Router
  • JavaScript Evaluator
  • JSON Generator
  • JSON Parser
  • Log Parser
  • Record Deduplicator
  • Schema Generator
  • Static Lookup
  • Stream Selector
  • Windowing Aggregator
  • XML Flattener
  • XML Parser
Includes the following destinations:
  • CoAP Client
  • HTTP Client
  • Local FS
  • MQTT Publisher
  • Named Pipe
  • Send Response to Origin
  • SFTP/FTP/FTPS Client
  • Splunk
  • Syslog
  • To Error
  • Trash
  • WebSocket Client
Includes the following executors:
  • Databricks Job Launcher
  • Email
  • Pipeline Finisher
  • Shell
streamsets-datacollector-bigtable-lib For Google Cloud Bigtable.

Includes the Google Bigtable destination.

streamsets-datacollector-cassandra_3-lib For Cassandra 1.2, 2.x, and 3.x.

Includes the Cassandra destination.

streamsets-datacollector-cdp_7_1-lib For Cloudera CDP 7.1.1 through 7.1.7.
Includes:
  • Hadoop FS Standalone origin
  • Kafka Multitopic Consumer origin
  • HBase Lookup processor
  • Hive Metadata processor
  • Kudu Lookup processor
  • Hadoop FS destination
  • HBase destination
  • Hive Metastore destination
  • Kafka Producer destination
  • Kudu destination
  • Solr destination
  • HDFS File Metadata executor
  • Hive Query executor
  • MapReduce executor
  • Spark executor
streamsets-datacollector-cdp_7_1_8-lib For Cloudera CDP 7.1.8.
Includes:
  • Hadoop FS Standalone origin
  • Kafka Multitopic Consumer origin
  • HBase Lookup processor
  • Hive Metadata processor
  • Kudu Lookup processor
  • Hadoop FS destination
  • HBase destination
  • Hive Metastore destination
  • Kafka Producer destination
  • Kudu destination
  • Solr destination
  • HDFS File Metadata executor
  • Hive Query executor
  • MapReduce executor
  • Spark executor
streamsets-datacollector-connx-lib For CONNX.
Includes:
  • CONNX origin
  • CONNX CDC origin
streamsets-datacollector-couchbase_2-lib For Couchbase SDK 2.x.
Includes:
  • Couchbase Lookup processor
  • Couchbase destination
streamsets-datacollector-couchbase_3-lib For Couchbase SDK 3.x.
Includes:
  • Couchbase origin
  • Couchbase destination
streamsets-datacollector-crypto-lib For cryptography stages.

Includes the Encrypt and Decrypt Fields processor.

streamsets-datacollector-cyberark-credentialstore-lib For the CyberArk credential store.
streamsets-datacollector-dataformats-lib

Contains parsers and generators for the data formats supported by Data Collector.

streamsets-datacollector-dev-lib For developing and testing pipelines.
Includes:
  • Dev Data Generator origin
  • Dev Random Record origin
  • Dev Raw Data Source origin
  • Dev SDC RPC with Buffering origin
  • Dev Snapshot origin
  • Dev Identity processor
  • Dev Random Error processor
  • Dev Record Creator processor
  • To Event destination
Note: Do not use these stages in production pipelines.
streamsets-datacollector-elasticsearch_5-lib For Elasticsearch 1.x, 2.x, and 5.x.

Includes the Elasticsearch origin and destination.

streamsets-datacollector-elasticsearch_6-lib For Elasticsearch 6.x.

Includes the Elasticsearch origin and destination.

streamsets-datacollector-elasticsearch_7-lib For Elasticsearch 7.x.

Includes the Elasticsearch origin and destination.

streamsets-datacollector-elasticsearch_8-lib For Elasticsearch 8.x.

Includes the Elasticsearch origin and destination.

streamsets-datacollector-google-cloud-lib For Google Cloud.
Includes:
  • Google BigQuery origin
  • Google Cloud Storage origin
  • Google Pub/Sub Subscriber origin
  • Google BigQuery destination
  • Google Cloud Storage destination
  • Google Pub/Sub Publisher destination
  • Google BigQuery executor
  • Google Cloud Storage executor
streamsets-datacollector-google-secret-manager-credentialstore-lib For the Google Secret Manager credential store.
streamsets-datacollector-groovy_2_4-lib For Groovy version 2.4.
Includes:
  • Groovy Scripting origin
  • Groovy Evaluator processor
streamsets-datacollector-groovy_4_0-lib For Groovy version 4.0.
Includes:
  • Groovy Scripting origin
  • Groovy Evaluator processor
streamsets-datacollector-influxdb_0_9-lib For InfluxDB versions 0.9 through 1.x.

Includes the InfluxDB destination.

streamsets-datacollector-influxdb_2_0-lib For InfluxDB version 2.x.

Includes the InfluxDB 2.x destination.

streamsets-datacollector-jdbc-branded-oracle-lib For Oracle.

Includes the Oracle destination.

streamsets-datacollector-jdbc-lib For JDBC access to databases.
Includes:
  • JDBC Multitable Consumer origin
  • JDBC Query Consumer origin
  • PostgreSQL CDC Client origin
  • Oracle CDC Client origin
  • SQL Server CDC Client origin
  • SQL Server Change Tracking origin
  • JDBC Lookup processor
  • JDBC Tee processor
  • PostgreSQL Metadata processor
  • SQL Parser processor
  • JDBC Producer destination
  • JDBC Query executor
streamsets-datacollector-jdbc-oracle-lib For Oracle.
Includes:
  • Oracle Bulkload origin
  • Oracle CDC origin
streamsets-datacollector-jdbc-sap-hana-lib For JDBC access to SAP HANA databases.

Includes the SAP HANA Query Consumer origin.

streamsets-datacollector-jks-credentialstore-lib For the Java keystore credential store.
streamsets-datacollector-jms-lib For Java Message Service (JMS).

Includes the JMS Consumer origin and JMS Producer destination.

streamsets-datacollector-jython_2_7-lib For Jython version 2.7.x.
Includes:
  • Jython Scripting origin
  • Jython Evaluator processor
streamsets-datacollector-kaitai-lib For Kaitai Struct.

Includes the Kaitai Struct Parser processor.

streamsets-datacollector-kinesis-lib For Amazon Kinesis.
Includes:
  • Kinesis Consumer origin
  • Kinesis Firehose destination
  • Kinesis Producer destination
streamsets-datacollector-mapr_6_1-lib For MapR version 6.1.0.
Includes:
  • MapR DB CDC origin
  • MapR DB JSON origin
  • MapR FS Standalone origin
  • MapR Multitopic Streams Consumer origin
  • MapR DB destination
  • MapR DB JSON destination
  • MapR FS destination
  • MapR FS File Metadata executor
streamsets-datacollector-mapr_6_1-mep6-lib For MapR 6.1.0 with EEP 6.x.
Includes:
  • MapR Streams Consumer origin
  • Hive Metadata processor
  • Hive Metastore destination
streamsets-datacollector-mapr_7_0-lib For HPE Ezmeral Data Fabric 7.0.x.
Includes:
  • MapR DB CDC origin
  • MapR DB JSON origin
  • MapR FS Standalone origin
  • MapR Multitopic Streams Consumer origin
  • MapR DB destination
  • MapR DB JSON destination
  • MapR FS destination
  • MapR FS File Metadata executor
streamsets-datacollector-mapr_7_0-mep8-lib For HPE Ezmeral Data Fabric 7.0.x with EEP 8.x.
Includes:
  • MapR Streams Consumer origin
  • Hive Metadata processor
  • Hive Metastore destination
streamsets-datacollector-mleap-lib For MLeap.

Includes the MLeap Evaluator processor.

streamsets-datacollector-mongodb_3-lib For MongoDB 3.0 with Java driver 3.5.0.
Includes:
  • MongoDB origin
  • MongoDB Oplog origin
  • MongoDB Lookup processor
  • MongoDB destination
streamsets-datacollector-mongodb_4-lib For MongoDB 4.0 with Java driver 3.12.0.
Includes:
  • MongoDB origin
  • MongoDB Oplog origin
  • MongoDB Lookup processor
  • MongoDB destination
streamsets-datacollector-mongodb-atlas-lib For MongoDB Atlas and MongoDB Enterprise Server.
Includes:
  • MongoDB Atlas origin
  • MongoDB Atlas CDC origin
  • MongoDB Atlas destination
streamsets-datacollector-mysql-binlog-lib For MySQL binary logs.

Includes the MySQL Binary Log origin.

streamsets-datacollector-orchestrator-lib For the orchestration stages.
Includes:
  • Cron Scheduler origin
  • Start Jobs origin
  • Control Hub API processor
  • Start Jobs processor
  • Wait for Jobs processor
streamsets-datacollector-postgres-aurora-lib For Amazon Aurora PostgreSQL versions 1 through 4.

Includes the Aurora PostgreSQL CDC Client origin.

streamsets-datacollector-rabbitmq-lib For RabbitMQ version 3.5.6.

Includes the RabbitMQ Consumer origin and RabbitMQ Producer destination.

streamsets-datacollector-redis-lib For Redis versions 2.8 and 3.0.
Includes:
  • Redis Consumer origin
  • Redis Lookup processor
  • Redis destination
streamsets-datacollector-salesforce-lib For Salesforce.
Includes:
  • Salesforce origin
  • Salesforce Bulk API 2.0 origin
  • Salesforce Lookup processor
  • Salesforce Bulk API 2.0 Lookup processor
  • Salesforce destination
  • Salesforce Bulk API 2.0 destination
  • Tableau CRM destination
streamsets-datacollector-sdc-databricks-lib For Databricks.
Includes:
  • Databricks Delta Lake destination
  • Databricks Query executor
streamsets-datacollector-sdc-snowflake-lib For Snowflake.
Includes:
  • Snowflake Bulk origin
  • Snowflake destination
  • Snowflake File Uploader destination
  • Snowflake executor
streamsets-datacollector-singlestore-lib For SingleStore.

Includes the SingleStore destination.

streamsets-datacollector-stats-lib

StreamSets Control Hub requires that the statistics stage library be installed on each Data Collector.

streamsets-datacollector-tensorflow-lib For TensorFlow.

Includes the TensorFlow Evaluator processor.

streamsets-datacollector-teradata-lib For Teradata.

Includes the Teradata destination.

streamsets-datacollector-thycotic-credentialstore-lib For the Thycotic Secret Server credential store.
streamsets-datacollector-vault-credentialstore-lib For the HashiCorp Vault credential store.
streamsets-datacollector-webclient-impl-okhttp For OkHttp.
Includes:
  • Web Client origin
  • Web Client processor
  • Web Client destination
streamsets-datacollector-wholefile-transformer-lib Includes the Whole File Transformer processor.
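
The Kafka entries in the table follow a regular naming pattern: the major and minor version are embedded in the library name, separated by underscores. As an illustration only, the following Python sketch maps a Kafka version string to the matching library name. The function kafka_stage_library and the SUPPORTED_KAFKA_VERSIONS set are hypothetical helpers written for this example, not part of any StreamSets API. The version set is copied from the table above and may not match the libraries available in your engine version.

```python
import re

# Kafka (major, minor) versions that have a stage library in the table above.
# Assumption: copied by hand from this page; verify against your engine version.
SUPPORTED_KAFKA_VERSIONS = {
    (1, 0), (1, 1),
    (2, 0), (2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6), (2, 7), (2, 8),
    (3, 0), (3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6),
}


def kafka_stage_library(version: str) -> str:
    """Return the stage library name for a Kafka version such as '3.4.1'.

    Hypothetical helper: it only formats the name, it installs nothing.
    """
    # Accept 'major.minor' with optional trailing components, e.g. '2.8' or '3.4.1'.
    match = re.match(r"^(\d+)\.(\d+)(?:\.\w+)*$", version.strip())
    if match is None:
        raise ValueError(f"Unrecognized Kafka version: {version!r}")

    major, minor = int(match.group(1)), int(match.group(2))
    if (major, minor) not in SUPPORTED_KAFKA_VERSIONS:
        raise LookupError(f"No Kafka stage library listed for {major}.{minor}.x")

    return f"streamsets-datacollector-apache-kafka_{major}_{minor}-lib"


if __name__ == "__main__":
    print(kafka_stage_library("3.4.1"))
```

The helper raises an error for versions that the table does not list, so a mistyped or unsupported version fails early instead of producing a library name that does not exist.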