Preparing for the Upgrade

Before upgrading Control Hub, shut down and back up the previous Control Hub version, and then back up the Control Hub databases. Depending on the version that you are upgrading from, you might also need to create new databases.

Step 1. Shut Down the Previous Version

Shut down the previous version of Control Hub.

Note: Any registered Data Collectors that are running at the time of the Control Hub shutdown continue to run remote pipeline instances in Control Hub disconnected mode. The Data Collectors maintain the offsets for the running pipelines and update Control Hub with the latest offsets as soon as they reconnect to Control Hub.

To shut down when Control Hub is started as a service, use the required command for your operating system:
  • For CentOS 6.x, Oracle Linux 6.x, Red Hat Enterprise Linux 6.x, or Ubuntu 14.04, use:
    service dpm stop
  • For CentOS 7.x, Oracle Linux 7.x - 8.x, or Red Hat Enterprise Linux 7.x - 8.x, use:
    systemctl stop dpm
To shut down when Control Hub is started manually from the tarball, use the Control Hub process ID in the following command:
kill <process ID>
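The choice between the two service commands above can be sketched as a small shell helper. This is only an illustration: the function names dpm_stop_command and detect_init_system are hypothetical, and the detection relies on the assumption that a running systemd creates the /run/systemd/system directory.

```shell
# Print the stop command for the given init system.
# "sysv" stands for the older releases listed above that use the service command.
dpm_stop_command() {
  case "$1" in
    sysv)    echo "service dpm stop" ;;
    systemd) echo "systemctl stop dpm" ;;
    *)       echo "unknown init system: $1" >&2; return 1 ;;
  esac
}

# Guess the init system (assumption: systemd creates this directory while running).
detect_init_system() {
  if [ -d /run/systemd/system ]; then
    echo systemd
  else
    echo sysv
  fi
}

# Usage (not executed here):
#   sudo $(dpm_stop_command "$(detect_init_system)")
```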

Step 2. Back Up the Previous Version

Before you install the new version, create a backup of the files in the previous version. That way, you can continue to run the previous version if needed.

Back up the directories defined in the following environment variables in the previous installation:

  • DPM_HOME - The base Control Hub directory that stores the executables and related files.
  • DPM_CONF - The Control Hub configuration directory.
  • DPM_LOG - The Control Hub log directory.

For more information about these environment variables, see Control Hub directories.

For example, if you are upgrading from version 3.25.0 in an RPM installation, create a backup of each directory and name the backup directories as follows:
  • /opt/streamsets-dpm-3250
  • /etc/dpm-3250
  • /var/log/dpm-3250
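One possible way to script this naming scheme is sketched below. The helper names version_suffix, backup_path, and backup_dir are hypothetical, and the sketch assumes that cp supports the -a option, as it does on the Linux distributions listed earlier.

```shell
# Turn a version such as 3.25.0 into the suffix used above (3250).
version_suffix() {
  printf '%s' "$1" | tr -d '.'
}

# Print the backup path for a directory, for example /etc/dpm -> /etc/dpm-3250.
# A trailing slash on the directory is dropped first.
backup_path() {
  printf '%s-%s\n' "${1%/}" "$(version_suffix "$2")"
}

# Copy one directory to its backup path, preserving permissions and timestamps.
backup_dir() {
  cp -a "${1%/}" "$(backup_path "$1" "$2")"
}

# Usage (not executed here), with the environment variables described above:
#   for dir in "$DPM_HOME" "$DPM_CONF" "$DPM_LOG"; do
#     backup_dir "$dir" 3.25.0
#   done
```

Run the copy as a user who can read all three directories and write to their parent directories.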

Step 3. Back Up the Databases

Back up the relational and time series databases used by Control Hub.

Back Up the Relational Databases

Back up each Control Hub application database in the MariaDB, MySQL, or PostgreSQL relational database instance.

Back up the MariaDB or MySQL databases

For instructions about backing up MariaDB databases, see the MariaDB documentation. For instructions about backing up MySQL databases, see the MySQL documentation.

For example, if MariaDB or MySQL is installed on a remote machine, you might run the following command for each database to back up the database to a gzip file:
mysqldump --add-drop-table --single-transaction -u <username> -p<password> <database name> -h <database host> -P <database port> | gzip > <database name>.sql.gz
You'd run the following command to back up the jobrunner database:
mysqldump --add-drop-table --single-transaction -u myuser -pmypassword jobrunner -h sch.acme.dbs.com -P 3306 | gzip > jobrunner.sql.gz
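To repeat the dump for every application database, the command can be wrapped in a function, as in the following sketch. The function names dump_file and dump_database are hypothetical, and the connection values are the placeholders from the example above. Unlike that example, the sketch passes -p without a value so that mysqldump prompts for the password instead of exposing it on the command line.

```shell
# Placeholder connection settings taken from the example above.
DB_USER="myuser"
DB_HOST="sch.acme.dbs.com"
DB_PORT="3306"

# Name of the gzip file for a given database, matching the example above.
dump_file() {
  printf '%s.sql.gz\n' "$1"
}

# Dump one database into the current directory.
# -p without a value makes mysqldump prompt for the password.
dump_database() {
  mysqldump --add-drop-table --single-transaction \
    -u "$DB_USER" -p -h "$DB_HOST" -P "$DB_PORT" "$1" | gzip > "$(dump_file "$1")"
}

# Usage (not executed here); list every Control Hub database in your instance:
#   for db in jobrunner; do
#     dump_database "$db"
#   done
```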

Back up the PostgreSQL databases

For instructions about backing up PostgreSQL databases, see the PostgreSQL documentation.

For example, you might run the pg_dump utility program for each database to back up the database to a script file:
sudo pg_dump -U <user name> <database name> > <database name>.pgsql
You'd run the following command to back up the jobrunner database:
sudo pg_dump -U myuser jobrunner > jobrunner.pgsql
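A backup is only useful if the dump actually contains data, so a script might check each file after writing it. The following is a minimal sketch under stated assumptions: backup_pg_database is a hypothetical helper, PG_USER is the placeholder user from the example above, and sudo is omitted. Add sudo if your environment requires it, as in the example above.

```shell
# Placeholder user taken from the example above.
PG_USER="myuser"

# Dump one database to <directory>/<database>.pgsql and fail if the file is empty.
# $1 = database name, $2 = target directory (defaults to the current directory).
backup_pg_database() {
  db="$1"
  out="${2:-.}/$db.pgsql"
  pg_dump -U "$PG_USER" "$db" > "$out" || return 1
  if [ ! -s "$out" ]; then
    echo "empty dump: $out" >&2
    return 1
  fi
  echo "$out"
}

# Usage (not executed here):
#   backup_pg_database jobrunner /tmp/backup
```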

Back Up the Time Series Databases

Back up the following databases in the time series database instance on InfluxDB:
  • Metrics
  • Application Metrics

For instructions about backing up InfluxDB databases in a development environment, see the InfluxDB documentation.

For instructions about backing up InfluxDB Enterprise databases for a highly available Control Hub in a production environment, see the InfluxDB Enterprise documentation.

For example, if using InfluxDB Enterprise for a highly available Control Hub, you might run the following InfluxDB command to back up each InfluxDB database:
influxd-ctl backup -db <database name> <path to backup directory>
You'd run the following command to back up the Metrics database named sch:
influxd-ctl backup -db sch /tmp/backup

Step 4. Create New Databases

Control Hub includes new applications in the following versions:
  • Version 3.0.0 includes a new Provisioning application.
  • Version 3.2.0 includes new Reporting and Scheduler applications.
  • Version 3.5.0 includes new Policy and Classification applications.
  • Version 3.13.0 includes a new Dynamic Preview application.
  • Version 3.19.0 includes a new Connection application.
Depending on the version that you are upgrading from, create the following new relational databases:
Upgrading from Version    Required New Databases
2.7.x                     Provisioning, Reporting, Scheduler, Classification, Policy, Dynamic Preview, Connection
3.0.x or 3.1.x            Reporting, Scheduler, Classification, Policy, Dynamic Preview, Connection
3.2.x or 3.3.x            Classification, Policy, Dynamic Preview, Connection
3.5.x through 3.12.x      Dynamic Preview, Connection
3.13.x through 3.18.x     Connection
Important: Each application requires a unique database in the relational database instance. Create a database for each application, even if you do not plan to use the functionality offered by that application.
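The lookup above follows from the release in which each application was introduced: a database is required when the version that you are upgrading from is older than that release. The following sketch encodes that rule. The function name required_new_databases is hypothetical, the printed names are the database names used in the CREATE DATABASE examples in this section, and versions that the table does not list (for example, releases earlier than 2.7.x) are handled by the same rule as an assumption.

```shell
# Print the new databases required when upgrading from the given version,
# one name per line. $1 = version such as 3.5.2.
required_new_databases() {
  major=${1%%.*}
  rest=${1#*.}
  minor=${rest%%.*}
  # Each entry is "<3.x minor release that introduced the application>:<database name>".
  for entry in 0:provisioning 2:reporting 2:scheduler 5:policy \
               5:sdp_classification 13:dynamic_preview 19:connection; do
    introduced=${entry%%:*}
    name=${entry#*:}
    if [ "$major" -lt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -lt "$introduced" ]; }; then
      echo "$name"
    fi
  done
}

# Usage (not executed here):
#   required_new_databases 3.13.1
```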

Creating the Databases in MariaDB or MySQL

Based on the version that you are upgrading from, create a database for each new application and then create a user with all privileges on these databases.

  1. Log in to MariaDB or MySQL as the admin user.
  2. Use the following command to create a unique database for each new application:
    CREATE DATABASE <database name>;

    For example, if upgrading from version 2.7.x, create the new databases with the following names to match the new applications:

    CREATE DATABASE connection;
    CREATE DATABASE dynamic_preview;
    CREATE DATABASE policy;
    CREATE DATABASE provisioning;
    CREATE DATABASE reporting;
    CREATE DATABASE scheduler;
    CREATE DATABASE sdp_classification;
    Tip: If you have both a development and production environment using the same relational database instance, create unique database names for each environment. For example, provisioning_dev and provisioning_prod.
  3. Use the following command to verify that the databases were successfully created with the correct names:
    SHOW DATABASES;
  4. Create a MariaDB or MySQL user with all privileges on the databases.
    When you install Control Hub, you'll configure Control Hub to use this user account to connect to the databases. You can use one user account for all of the databases, or you can create a unique user account for each database.

    The commands you use depend on whether the database software is installed on the local Control Hub machine or on a remote machine:

    • Local machine - Use the following commands to create a user with all privileges on the provisioning database:
      CREATE USER 'provisioning'@'%' identified by 'provisioning';
      grant all privileges on provisioning.* to 'provisioning'@'%';
      CREATE USER 'provisioning'@'<full host name>' identified by 'provisioning';
      grant all privileges on provisioning.* to 'provisioning'@'<full host name>';
    • Remote machine - Use the following commands to create a user with all privileges on the provisioning database:
      CREATE USER 'provisioning'@'%' identified by 'provisioning';
      grant all privileges on provisioning.* to 'provisioning'@'%';

    Repeat the commands for each new application database.
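Because the same three statements are repeated for every database, they can be generated rather than typed. The following sketch prints the statements for one database. The function name mysql_setup_sql is hypothetical, and the literal <password> is a placeholder to replace before running the output.

```shell
# Print CREATE DATABASE, CREATE USER, and GRANT statements for one application database.
# $1 = database name, $2 = user name, $3 = host pattern ('%' matches any host).
mysql_setup_sql() {
  printf "CREATE DATABASE %s;\n" "$1"
  printf "CREATE USER '%s'@'%s' IDENTIFIED BY '<password>';\n" "$2" "$3"
  printf "GRANT ALL PRIVILEGES ON %s.* TO '%s'@'%s';\n" "$1" "$2" "$3"
}

# Usage (not executed here): review the generated SQL, then run it as the admin user.
#   for db in provisioning reporting scheduler; do
#     mysql_setup_sql "$db" "$db" '%'
#   done > new_databases.sql
```

Writing the statements to a file first makes it possible to set the passwords and review the result before anything is run against the database instance.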

Creating the Databases in PostgreSQL

Based on the version that you are upgrading from, create a database for each new application and then create a user with all privileges on these databases.

  1. Connect to PostgreSQL as a user who can create databases.
  2. Use the following command to create a unique database for each new application:
    CREATE DATABASE <database name> WITH ENCODING = 'UTF8';

    For example, if upgrading from version 2.7.x, create the new databases with the following names to match the new applications:

    CREATE DATABASE connection WITH ENCODING = 'UTF8';
    CREATE DATABASE dynamic_preview WITH ENCODING = 'UTF8';
    CREATE DATABASE policy WITH ENCODING = 'UTF8';
    CREATE DATABASE provisioning WITH ENCODING = 'UTF8';
    CREATE DATABASE reporting WITH ENCODING = 'UTF8';
    CREATE DATABASE scheduler WITH ENCODING = 'UTF8';
    CREATE DATABASE sdp_classification WITH ENCODING = 'UTF8';
    Tip: If you have both a development and production environment using the same relational database instance, create unique database names for each environment. For example, provisioning_dev and provisioning_prod.
  3. Verify that the required databases were successfully created with the correct names.
  4. Create a user with all privileges on these databases.

    When you install Control Hub, you'll configure Control Hub to use this user account to connect to the databases. You can use one user account for all of the databases, or you can create a unique user account for each database.

    For information on creating users, see the PostgreSQL documentation.

    For example, use the following commands to create a provisioning user with all privileges on the provisioning database:
    CREATE USER provisioning with password '<password>';
    grant all privileges on database provisioning to provisioning;
    Repeat the commands for each new application database.
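The verification in step 3 can be done by listing the databases in the instance and comparing the result with the names that you expect. The following is a sketch under stated assumptions: missing_databases is a hypothetical helper, and the usage line assumes that you connect with psql as a user named postgres.

```shell
# Read existing database names on stdin (one per line) and print every
# expected name, passed as an argument, that is not among them.
missing_databases() {
  existing=$(cat)
  for name in "$@"; do
    printf '%s\n' "$existing" | grep -qxF "$name" || echo "$name"
  done
}

# Usage (not executed here): -A and -t make psql print bare names, one per line.
#   psql -U postgres -At -c "SELECT datname FROM pg_database" |
#     missing_databases connection dynamic_preview policy
```

No output means that every expected database exists.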