Core Installation
Users with a StreamSets enterprise account can use the Data Collector core installation.
The core installation is a minimal installation that generally requires installing additional stage libraries to develop pipelines. The core installation allows Data Collector to use less disk space.
To use the Data Collector core installation, you can download the RPM package or the core tarball.
The core installation includes the following stage libraries:
- Basic stage library
- Data formats stage library
- Development stage library
- Statistics stage library
- Windows stage library
You then use the command line interface to install additional stage libraries.
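As a minimal sketch of that command-line step, the following builds the comma-separated argument for the stagelibs command of the streamsets launcher script and prints the resulting command instead of running it. The installation directory and the two library names are assumptions chosen for illustration; substitute the values for your own installation.

```shell
#!/bin/sh
# Assumption: SDC_DIST points at the Data Collector installation directory.
# The fallback path below is a placeholder, not a value taken from this page.
SDC_DIST="${SDC_DIST:-/opt/streamsets-datacollector}"

# Join all arguments with commas. The subshell keeps the IFS change local.
stagelibs_arg() (
  IFS=,
  printf '%s' "$*"
)

# Hypothetical selection of additional stage libraries.
libs="$(stagelibs_arg streamsets-datacollector-jdbc-lib streamsets-datacollector-aws-lib)"

# Print the command rather than executing it, so the sketch is safe to run anywhere.
echo "$SDC_DIST/bin/streamsets stagelibs -install=$libs"
```

Once the printed command looks right, run it as a user that can write to the installation directory, then restart Data Collector so the new libraries are loaded.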
The core installation includes the following stages:
- Origins
  - CoAP Server
  - Directory
  - File Tail
  - gRPC Client
  - HTTP Client
  - HTTP Server
  - JavaScript Scripting
  - MQTT Subscriber
  - OPC UA Client
  - REST Service
  - SFTP/FTP/FTPS Client
  - System Metrics
  - TCP Server
  - UDP Multithreaded Source
  - UDP Source
  - WebSocket Client
  - WebSocket Server
  - Windows Event Log
- Processors
  - Base64 Field Decoder
  - Base64 Field Encoder
  - Data Generator
  - Data Parser
  - Delay
  - Expression Evaluator
  - Field Flattener
  - Field Hasher
  - Field Mapper
  - Field Masker
  - Field Merger
  - Field Order
  - Field Pivoter
  - Field Remover
  - Field Renamer
  - Field Replacer
  - Field Splitter
  - Field Type Converter
  - Field Zip
  - Geo IP
  - HTTP Client
  - HTTP Router
  - JavaScript Evaluator
  - JSON Generator
  - JSON Parser
  - Log Parser
  - Record Deduplicator
  - Schema Generator
  - Static Lookup
  - Stream Selector
  - Windowing Aggregator
  - XML Flattener
  - XML Parser
- Destinations
  - CoAP Client
  - HTTP Client
  - Local FS
  - MQTT Publisher
  - Named Pipe
  - Send Response to Origin
  - SFTP/FTP/FTPS Client
  - Splunk
  - Syslog
  - To Error
  - Trash
  - WebSocket Client
- Executors
  - Databricks Job Launcher
  - Pipeline Finisher
  - Shell
Installing the Core RPM Package
Users with a StreamSets enterprise account can install the Data Collector RPM package and start it as a service on CentOS, Oracle Linux, or Red Hat Enterprise Linux. To install the core version of Data Collector, download the RPM package.
After you perform the core installation and launch, install individual stage libraries as needed.
When you install from the RPM package, Data Collector uses the default directories and the default system user and group.
The default system user and group are named sdc. If an sdc user and an sdc group do not exist on the machine, the installation creates the user and group for you and assigns them the next available user ID and group ID.
To use specific IDs for the sdc user and group, create the user and group before installation and specify the IDs that you want to use. For example, if you’re installing Data Collector on multiple machines, you might want to create the system user and group before installation to ensure that the user ID and group ID are consistent across the machines.
- Access the Data Collector RPM package from the StreamSets Support portal.
- Download the RPM package for your operating system:
  - For CentOS 6, Oracle Linux 6, or Red Hat Enterprise Linux 6, download the RPM EL6 package.
  - For CentOS 7, Oracle Linux 7, or Red Hat Enterprise Linux 7, download the RPM EL7 package.
  - For Oracle Linux 8 or Red Hat Enterprise Linux 8, download the RPM EL8 package.
- Use the following command to extract the file to the desired location:
    tar xf streamsets-datacollector-<version>-<operating_system>-all-rpms.tar
  For example, to extract version 6.0.0 on CentOS 7, use the following command:
    tar xf streamsets-datacollector-6.0.0-el7-all-rpms.tar
- Use the following command to install the core RPM package:
    yum localinstall streamsets-datacollector-<version>-1.noarch.rpm
  For example, to install version 6.0.0, use the following command:
    yum localinstall streamsets-datacollector-6.0.0-1.noarch.rpm
- To start Data Collector as a service, use the required command for your operating system:
  - For CentOS 6, Oracle Linux 6, or Red Hat Enterprise Linux 6, use:
      service sdc start
  - For later operating systems, use:
      systemctl start sdc
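The RPM procedure above, together with the earlier note about creating the sdc user and group with fixed IDs, can be sketched as a dry-run script. It only prints the commands it would run. The version, the operating system release, and the numeric IDs are placeholders chosen for illustration, and the helper functions el_suffix, start_cmd, and print_plan are hypothetical names, not part of Data Collector.

```shell
#!/bin/sh
# Dry-run sketch: every command is printed, nothing is executed.
# Assumptions (placeholders, not values from the documentation):
SDC_VERSION="6.0.0"
OS_MAJOR="7"       # major release of CentOS, Oracle Linux, or RHEL
SDC_UID="1500"     # hypothetical user ID to keep consistent across machines
SDC_GID="1500"     # hypothetical group ID to keep consistent across machines

# Map an OS major release to the package suffix used in the download step.
el_suffix() {
  case "$1" in
    6|7|8) echo "el$1" ;;
    *) echo "unsupported" ;;
  esac
}

# Choose the service command: SysV init on release 6, systemd on later releases.
start_cmd() {
  if [ "$1" = "6" ]; then
    echo "service sdc start"
  else
    echo "systemctl start sdc"
  fi
}

# Print the full sequence, starting with the optional pre-creation of the
# sdc group and user so that the installer does not pick the next free IDs.
print_plan() {
  echo "groupadd --system --gid $SDC_GID sdc"
  echo "useradd --system --uid $SDC_UID --gid sdc sdc"
  echo "tar xf streamsets-datacollector-$SDC_VERSION-$(el_suffix "$OS_MAJOR")-all-rpms.tar"
  echo "yum localinstall streamsets-datacollector-$SDC_VERSION-1.noarch.rpm"
  start_cmd "$OS_MAJOR"
}

print_plan
```

Review the printed plan, then run the commands as root on each machine. Using the same SDC_UID and SDC_GID values everywhere is what keeps the IDs consistent.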
Installing the Core Tarball
Users with a StreamSets enterprise account can install the Data Collector core tarball.
To install the core version of Data Collector, download the core tarball. After you perform the core installation and launch, install individual stage libraries as needed.
- Download the core Data Collector tarball from the StreamSets Support portal.
- Use one of the following installation methods to install the core Data Collector:
  - Install and launch manually. For details, see Full Installation and Launch (Manual Start).
  - Install and launch as a service. For details, see Installing from the Tarball for Systems Using SysV Init or Installing from the Tarball for Systems Using Systemd Init.