Creating the Databases

Install the required database software and create the databases before installing StreamSets Control Hub.

Step 1. Install the Database Software

Control Hub requires a relational database instance and a time series database instance. You can install the database software on the same machine as the Control Hub instance or on a remote machine. For best performance, we recommend installing on remote machines.

Be sure to install the database software on a machine that meets the installation requirements.

Install the required software based on whether you are installing a single Control Hub instance or installing multiple instances for a highly available Control Hub system:

Single Control Hub instance
For the relational database instance, install one of the following:
For the time series database instance, install InfluxDB as described in the InfluxDB documentation.
Multiple Control Hub instances for high availability
For the relational database instance, install one of the following:
For the time series database instance, install InfluxDB Enterprise as described in the InfluxEnterprise documentation.

Setting MariaDB or MySQL Server System Variables

If using MariaDB or MySQL for the relational database instance, set the following MariaDB server system variables or MySQL server system variables to these values on each machine where the database software is installed:
Server System Variable Value
character_set_client

character_set_connection

character_set_database

character_set_server

character_set_system

utf8
collation_connection utf8_general_ci
collation_database

collation_server

utf8_unicode_ci
lower_case_table_names 0

Enabling Authentication and HTTPS for InfluxDB

By default, an InfluxDB installation has authentication and HTTPS disabled. For a production environment, we recommend enabling both authentication and HTTPS.

When you enable authentication, you must create an admin user. Use this admin user when you create the required databases in InfluxDB.

For instructions, see Enabling Authentication and HTTPS Setup in the InfluxDB documentation.

Step 2. Create the Relational Databases

Create a unique database for each Control Hub application in the MariaDB, MySQL, or PostgreSQL relational database instance.

Important: Each application requires a unique database in the relational database instance. Create a database for each application, even if you do not plan to use the functionality offered by that application.

Creating Databases in MariaDB or MySQL

Create a unique database for each application and then create a user with all privileges on these databases.

  1. Connect to MariaDB or MySQL as a user who can create databases.
    You must be an admin user to create databases in MariaDB or MySQL.
  2. Use the following command to create a unique database for each application:
    CREATE DATABASE <database name>;

    You can use any database name. Later, when you set up Control Hub, you specify the database connection details for each application.

    For example, you might create databases with the following names to match each application:

    CREATE DATABASE connection;
    CREATE DATABASE dynamic_preview;
    CREATE DATABASE jobrunner;
    CREATE DATABASE messaging;
    CREATE DATABASE notification;
    CREATE DATABASE pipelinestore;
    CREATE DATABASE policy;
    CREATE DATABASE provisioning;
    CREATE DATABASE reporting;
    CREATE DATABASE scheduler;
    CREATE DATABASE sdp_classification;
    CREATE DATABASE security;
    CREATE DATABASE sla;
    CREATE DATABASE timeseries;
    CREATE DATABASE topology;
    Tip: If you have both a development and production environment using the same relational database instance, create unique database names for each environment. For example, security_dev and security_prod.
  3. Verify that the required databases were successfully created with the correct names.
  4. Create a user with all privileges on these databases.

    When you install Control Hub, you'll configure Control Hub to use this user account to connect to the databases. You can use one user account for all of the databases, or you can create a unique user account for each database.

    For information on creating users for MariaDB, see the MariaDB documentation.

    For information on creating users for MySQL, see the MySQL documentation.

    The commands you use depend on whether database software is installed on the local Control Hub machine or on a remote machine:

    • Local machine - Use the following commands to create another user with all privileges on the jobrunner database:
      CREATE USER 'jobrunner'@'%' identified by 'jobrunner';
      grant all privileges on jobrunner.* to 'jobrunner'@'%';
      CREATE USER 'jobrunner'@'<full host name>' identified by 'jobrunner';
      grant all privileges on jobrunner.* to 'jobrunner'@'<full host name>';
    • Remote machine - Use the following commands to create another user with all privileges on the jobrunner database:
      CREATE USER 'jobrunner'@'%' identified by 'jobrunner';
      grant all privileges on jobrunner.* to 'jobrunner'@'%';
    Repeat the commands for each application database.

Creating Databases in PostgreSQL

Create a unique database for each application and then create a user with all privileges on these databases.

  1. Connect to PostgreSQL as a user who can create databases.
    You must be a superuser or a user with the special CREATEDB privilege to create databases in PostgreSQL. For more information, see the PostgreSQL documentation.
  2. Use the following command to create a unique database for each application:
    CREATE DATABASE <database name> WITH ENCODING = UTF8;

    You can use any database name. Later, when you set up Control Hub, you specify the database connection details for each application.

    For example, you might create databases with the following names to match each application:

    CREATE DATABASE connection WITH ENCODING = 'UTF8';
    CREATE DATABASE dynamic_preview WITH ENCODING = 'UTF8';
    CREATE DATABASE jobrunner WITH ENCODING = 'UTF8';
    CREATE DATABASE messaging WITH ENCODING = 'UTF8';
    CREATE DATABASE notification WITH ENCODING = 'UTF8';
    CREATE DATABASE pipelinestore WITH ENCODING = 'UTF8';
    CREATE DATABASE policy WITH ENCODING = 'UTF8';
    CREATE DATABASE provisioning WITH ENCODING = 'UTF8';
    CREATE DATABASE reporting WITH ENCODING = 'UTF8';
    CREATE DATABASE scheduler WITH ENCODING = 'UTF8';
    CREATE DATABASE sdp_classification WITH ENCODING = 'UTF8';
    CREATE DATABASE security WITH ENCODING = 'UTF8';
    CREATE DATABASE sla WITH ENCODING = 'UTF8';
    CREATE DATABASE timeseries WITH ENCODING = 'UTF8';
    CREATE DATABASE topology WITH ENCODING = 'UTF8';
    Tip: If you have both a development and production environment using the same relational database instance, create unique database names for each environment. For example, security_dev and security_prod.
  3. Verify that the required databases were successfully created with the correct names.
  4. Create a user with all privileges on these databases.

    When you install Control Hub, you'll configure Control Hub to use this user account to connect to the databases. You can use one user account for all of the databases, or you can create a unique user account for each database.

    For information on creating users, see the PostgreSQL documentation.

    For example, use the following commands to create a jobrunner user with all privileges on the jobrunner database:
    CREATE USER jobrunner with password '<password>';
    grant all privileges on database jobrunner to jobrunner;
    Repeat the commands for each application database.

Step 3. Create the Time Series Databases

Create the following databases in the InfluxDB time series database instance:
  • Metrics
  • Application Metrics
  1. Use the following command to start the InfluxDB server:
    service influxdb start
  2. Use the following command to connect to InfluxDB as the admin user that you created when you enabled authentication for InfluxDB:
    influx -username <admin user name> -password <password> -precision rfc3339

    You must be an admin user to create databases in InfluxDB.

    Note: The command assumes that InfluxDB is installed on the local host using the default port 8086. If you changed the defaults during the InfluxDB installation, use the appropriate command to connect. For more information, see the InfluxDB documentation.
  3. Use the following Influx Query Language (InfluxQL) statement to create each database:
    > CREATE DATABASE <database name>

    For example, to create a Metrics database named sch and an Application Metrics database named sch_app, run the following statements:

    > CREATE DATABASE sch
    > CREATE DATABASE sch_app
    Tip: If you have both a development and production environment using the same time series database instance, create unique database names for each environment. For example, sch_dev and sch_prod.
  4. Use the following InfluxQL statement to verify that the databases were successfully created with the correct names:
    > SHOW DATABASES
    Verify that the statement output includes the required databases you just created. For example:
    name: databases
    name
    ----
    _internal
    sch
    sch_app
    Note: The _internal database is created and used by InfluxDB to store internal runtime metrics.
  5. Create an InfluxDB user with all privileges on these time series databases.
    When you install Control Hub, you'll configure Control Hub to use this user account to connect to the databases. You can use one user account for both databases, or you can create a unique user account for each database.
    For example, use the following InfluxQL statements to create another user with all privileges on the Metrics database (sch) and the Application Metrics database (sch_app):
    > CREATE USER <username> WITH PASSWORD '<password>'
    > grant ALL on sch to <username>
    > grant ALL on sch_app to <username>
    > grant read on _internal to <username>