Creating the Databases
Install the required database software and create the databases before installing StreamSets Control Hub.
Step 1. Install the Database Software
Control Hub requires a relational database instance and a time series database instance. You can install the database software on the same machine as the Control Hub instance or on a remote machine. For best performance, we recommend installing on remote machines.
Be sure to install the database software on a machine that meets the installation requirements.
Install the required software based on whether you are installing a single Control Hub instance or installing multiple instances for a highly available Control Hub system:
- Single Control Hub instance
- For the relational database instance, install one of the following:
- MariaDB as described in the MariaDB documentation.
- MySQL as described in the MySQL documentation.
- PostgreSQL as described in the PostgreSQL documentation.
- Multiple Control Hub instances for high availability
- For the relational database instance, install one of the following:
- MariaDB Galera Cluster as described in the MariaDB documentation.
- MySQL Enterprise High Availability as described in the MySQL Enterprise High Availability documentation.
- PostgreSQL with high availability enabled as described in the PostgreSQL high availability documentation.
Setting MariaDB or MySQL Server System Variables
Server System Variable | Value |
---|---|
character_set_client character_set_connection character_set_database character_set_server character_set_system |
utf8 |
collation_connection | utf8_general_ci |
collation_database collation_server |
utf8_unicode_ci |
lower_case_table_names | 0 |
Enabling Authentication and HTTPS for InfluxDB
By default, an InfluxDB installation has authentication and HTTPS disabled. For a production environment, we recommend enabling both authentication and HTTPS.
When you enable authentication, you must create an admin user. Use this admin user when you create the required databases in InfluxDB.
For instructions, see Enabling Authentication and HTTPS Setup in the InfluxDB documentation.
Step 2. Create the Relational Databases
Create a unique database for each Control Hub application in the MariaDB, MySQL, or PostgreSQL relational database instance.
Creating Databases in MariaDB or MySQL
Create a unique database for each application and then create a user with all privileges on these databases.
-
Connect to MariaDB or MySQL as a user who can create databases.
You must be an admin user to create databases in MariaDB or MySQL.
-
Use the following command to create a unique database for each
application:
CREATE DATABASE <database name>;
You can use any database name. Later, when you set up Control Hub, you specify the database connection details for each application.
For example, you might create databases with the following names to match each application:
CREATE DATABASE connection; CREATE DATABASE dynamic_preview; CREATE DATABASE jobrunner; CREATE DATABASE messaging; CREATE DATABASE notification; CREATE DATABASE pipelinestore; CREATE DATABASE policy; CREATE DATABASE provisioning; CREATE DATABASE reporting; CREATE DATABASE scheduler; CREATE DATABASE sdp_classification; CREATE DATABASE security; CREATE DATABASE sla; CREATE DATABASE timeseries; CREATE DATABASE topology;
Tip: If you have both a development and production environment using the same relational database instance, create unique database names for each environment. For example, security_dev and security_prod. - Verify that the required databases were successfully created with the correct names.
-
Create a user with all privileges on these databases.
When you install Control Hub, you'll configure Control Hub to use this user account to connect to the databases. You can use one user account for all of the databases, or you can create a unique user account for each database.
For information on creating users for MariaDB, see the MariaDB documentation.
For information on creating users for MySQL, see the MySQL documentation.
The commands you use depend on whether database software is installed on the local Control Hub machine or on a remote machine:
- Local machine - Use the following commands to
create another user with all privileges on the jobrunner
database:
CREATE USER 'jobrunner'@'%' identified by 'jobrunner'; grant all privileges on jobrunner.* to 'jobrunner'@'%'; CREATE USER 'jobrunner'@'<full host name>' identified by 'jobrunner'; grant all privileges on jobrunner.* to 'jobrunner'@'<full host name>';
- Remote machine - Use the following commands to
create another user with all privileges on the jobrunner
database:
CREATE USER 'jobrunner'@'%' identified by 'jobrunner'; grant all privileges on jobrunner.* to 'jobrunner'@'%';
Repeat the commands for each application database. - Local machine - Use the following commands to
create another user with all privileges on the jobrunner
database:
Creating Databases in PostgreSQL
Create a unique database for each application and then create a user with all privileges on these databases.
-
Connect to PostgreSQL as a user who can create databases.
You must be a superuser or a user with the special CREATEDB privilege to create databases in PostgreSQL. For more information, see the PostgreSQL documentation.
-
Use the following command to create a unique database for each
application:
CREATE DATABASE <database name> WITH ENCODING = UTF8;
You can use any database name. Later, when you set up Control Hub, you specify the database connection details for each application.
For example, you might create databases with the following names to match each application:
CREATE DATABASE connection WITH ENCODING = 'UTF8'; CREATE DATABASE dynamic_preview WITH ENCODING = 'UTF8'; CREATE DATABASE jobrunner WITH ENCODING = 'UTF8'; CREATE DATABASE messaging WITH ENCODING = 'UTF8'; CREATE DATABASE notification WITH ENCODING = 'UTF8'; CREATE DATABASE pipelinestore WITH ENCODING = 'UTF8'; CREATE DATABASE policy WITH ENCODING = 'UTF8'; CREATE DATABASE provisioning WITH ENCODING = 'UTF8'; CREATE DATABASE reporting WITH ENCODING = 'UTF8'; CREATE DATABASE scheduler WITH ENCODING = 'UTF8'; CREATE DATABASE sdp_classification WITH ENCODING = 'UTF8'; CREATE DATABASE security WITH ENCODING = 'UTF8'; CREATE DATABASE sla WITH ENCODING = 'UTF8'; CREATE DATABASE timeseries WITH ENCODING = 'UTF8'; CREATE DATABASE topology WITH ENCODING = 'UTF8';
Tip: If you have both a development and production environment using the same relational database instance, create unique database names for each environment. For example, security_dev and security_prod. - Verify that the required databases were successfully created with the correct names.
-
Create a user with all privileges on these databases.
When you install Control Hub, you'll configure Control Hub to use this user account to connect to the databases. You can use one user account for all of the databases, or you can create a unique user account for each database.
For information on creating users, see the PostgreSQL documentation.
For example, use the following commands to create a jobrunner user with all privileges on the jobrunner database:CREATE USER jobrunner with password '<password>'; grant all privileges on database jobrunner to jobrunner;
Repeat the commands for each application database.
Step 3. Create the Time Series Databases
- Metrics
- Application Metrics
-
Use the following command to start the InfluxDB server:
service influxdb start
-
Use the following command to connect to InfluxDB as the admin user that you
created when you enabled authentication for InfluxDB:
influx -username <admin user name> -password <password> -precision rfc3339
You must be an admin user to create databases in InfluxDB.
Note: The command assumes that InfluxDB is installed on the local host using the default port 8086. If you changed the defaults during the InfluxDB installation, use the appropriate command to connect. For more information, see the InfluxDB documentation. -
Use the following Influx Query Language (InfluxQL) statement to create each
database:
> CREATE DATABASE <database name>
For example, to create a Metrics database named sch and an Application Metrics database named sch_app, run the following statements:
> CREATE DATABASE sch > CREATE DATABASE sch_app
Tip: If you have both a development and production environment using the same time series database instance, create unique database names for each environment. For example, sch_dev and sch_prod. -
Use the following InfluxQL statement to verify that the databases were
successfully created with the correct names:
> SHOW DATABASES
Verify that the statement output includes the required databases you just created. For example:name: databases name ---- _internal sch sch_app
Note: The _internal database is created and used by InfluxDB to store internal runtime metrics. -
Create an InfluxDB user with all privileges on these time series
databases.
When you install Control Hub, you'll configure Control Hub to use this user account to connect to the databases. You can use one user account for both databases, or you can create a unique user account for each database.For example, use the following InfluxQL statements to create another user with all privileges on the Metrics database (sch) and the Application Metrics database (sch_app):
> CREATE USER <username> WITH PASSWORD '<password>' > grant ALL on sch to <username> > grant ALL on sch_app to <username> > grant read on _internal to <username>